STL map insertion efficiency: [] vs. insert

STL map insertion efficiency: [] vs. insert - c++

There are two ways of map insertion:
m[key] = val;
Or
m.insert(make_pair(key, val));
My question is, which operation is faster?
People usually say the first one is slower, because the STL Standard at first 'insert' a default element if 'key' is not existing in map and then assign 'val' to the default element.
But I don't see the second way is better because of 'make_pair'. make_pair actually is a convenient way to make 'pair' compared to pair<T1, T2>(key, val). Anyway, both of them do two assignments, one is assigning 'key' to 'pair.first' and two is assigning 'val' to 'pair.second'. After pair is made, map inserts the element initialized by 'pair.second'.
So the first way is 1. 'default construct of typeof(val)' 2. assignment
the second way is 1. assignment 2. 'copy construct of typeof(val)'

Both accomplish different things.
m[key] = val;
Will insert a new key-value pair if the key doesn't exist already, or it will overwrite the old value mapped to the key if it already exists.
m.insert(make_pair(key, val));
Will only insert the pair if key doesn't exist yet, it will never overwrite the old value. So, choose accordingly to what you want to accomplish.
For the question what is more efficient: profile. :P Probably the first way I'd say though. The assignment (aka copy) is the case for both ways, so the only difference lies in construction. As we all know and should implement, a default construction should basically be a no-op, and thus be very efficient. A copy is exactly that - a copy. So in way one we get a "no-op" and a copy, and in way two we get two copies.
Edit: In the end, trust what your profiling tells you. My analysis was off like #Matthieu mentions in his comment, but that was my guessing. :)
Then, we have C++0x coming, and the double-copy on the second way will be naught, as the pair can simply be moved now. So in the end, I think it falls back on my first point: Use the right way to accomplish the thing you want to do.

On a lightly loaded system with plenty of memory, this code:
#include <map>
#include <iostream>
#include <ctime>
#include <string>
using namespace std;
typedef map <unsigned int,string> MapType;
const unsigned int NINSERTS = 1000000;
int main() {
MapType m1;
string s = "foobar";
clock_t t = clock();
for ( unsigned int i = 0; i < NINSERTS; i++ ) {
m1[i] = s;
}
cout << clock() - t << endl;
MapType m2;
t = clock();
for ( unsigned int i = 0; i < NINSERTS; i++ ) {
m2.insert( make_pair( i, s ) );
}
cout << clock() - t << endl;
}
produces:
1547
1453
or similar values on repeated runs. So insert is (in this case) marginally faster.

Performance wise I think they are mostly the same in general. There may be some exceptions for a map with large objects, in which case you should use [] or perhaps emplace which creates fewer temporary objects than 'insert'. See the discussion here for details.
You can, however, get a performance bump in special cases if you use the 'hint' function on the insert operator. So, looking at option 2 from here:
iterator insert (const_iterator position, const value_type& val);
the 'insert' operation can be reduced to constant time (from log(n)) if you give a good hint (often the case if you know you are adding things at the back of your map).

We have to refine the analysis by mentioning that the relative performance depends on the type(size) of the objects being copied as well.
I did a similar experiment (to nbt) with a map of (int -> set). I know it is a terrible thing to do, but, illustrative for this scenario. The "value", in this case a set of ints, has 20 elements in it.
I execute a million iterations of the []= Vs. insert operations and do RDTSC/iter-count.
[] = set | 10731 cycles
insert(make_pair<>) | 26100 cycles
It shows the magnitude of penalty added due to the copying. Of course, CPP11(move ctor's)
will change the picture.

My take on it:
Worth reminding that maps is a balanced binary tree, most of the modifications and checks take O(logN).
Depends really on the problem you are trying to solve.
1) if you just want to insert the value knowing that it is not there yet,
then [] would do two things:
a) check if the item is there or not
b) if it is not there will create pair and do what insert does (
double work of O( logN ) ), so I would use insert.
2) if you are not sure if it is there or not, then a) if you did check if the item is there by doing something like if( map.find( item ) == mp.end() ) couple of lines above somewhere, then use insert, because of double work [] would perform b) if you didn't check, then it depends, cause insert won't modify the value if it is there, [] will, otherwise they are equal

My answer is not on efficiency but on safety, which is relevant to choosing an insertion algorithm:
The [] and insert() calls would trigger destructors of the elements. This may have dangerous side effects if, say, your destructors have critical behaviors inside.
After such a hazard, I stopped relying on STL's implicit lazy insertion features and always use explicit checks if my objects have behaviors in their ctors/dtors.
See this question:
Destructor called on object when adding it to std::list

Related

Memory Allocation in C++, Using a Map of Linked Lists

The underlying data structure I am using is:
map<int, Cell> struct Cell{ char c; Cell*next; };
In effect the data structure maps an int to a linked list. The map(in this case implemented as a hashmap) ensures that finding a value in the list runs in constant time. The Linked List ensures that insertion and deletion also run in constant time. At each processing iteration I am doing something like:
Cell *cellPointer1 = new Cell;
//Process cells, build linked list
Once the list is built I put the elements Cell in map. The structure was working just fine and after my program I deallocate memory. For each Cell in the list.
delete cellPointer1
But at the end of my program I have a memory leak!!
To test memory leak I use:
#include <stdlib.h>
#include <crtdbg.h>
#define _CRTDBG_MAP_ALLOC
_CrtDumpMemoryLeaks();
I'm thinking that somewhere along the way the fact that I am putting the Cells in the map does not allow me to deallocate the memory correctly. Does anyone have any ideas on how to solve this problem?

We'll need to see your code for insertion and deletion to be sure about it.
What I'd see as a memleak-free insert / remove code would be:
( NOTE: I'm assuming you don't store the Cells that you allocate in the map )
//
// insert
//
std::map<int, Cell> _map;
Cell a; // no new here!
Cell *iter = &a;
while( condition )
{
Cell *b = new Cell();
iter->next = b;
iter = b;
}
_map[id] = a; // will 'copy' a into the container slot of the map
//
// cleanup:
//
std::map<int,Cell>::iterator i = _map.begin();
while( i != _map.end() )
{
Cell &a = i->second;
Cell *iter = a.next; // list of cells associated to 'a'.
while( iter != NULL )
{
Cell *to_delete = iter;
iter = iter->next;
delete to_delete;
}
_map.erase(i); // will remove the Cell from the map. No need to 'delete'
i++;
}
Edit: there was a comment indicating that I might not have understood the problem completely. If you insert ALL the cells you allocate in the map, then the faulty thing is that your map contains Cell, not Cell*.
If you define your map as: std::map<int, Cell *>, your problem would be solved at 2 conditions:
you insert all the Cells that you allocate in the map
the integer (the key) associated to each cell is unique (important!!)
Now the deletion is simply a matter of:
std::map<int, Cell*>::iterator i = _map.begin();
while( i != _map.end() )
{
Cell *c = i->second;
if ( c != NULL ) delete c;
}
_map.clear();

I've built almost the exact same hybrid data structure you are after (list/map with the same algorithmic complexity if I were to use unordered_map instead) and have been using it from time to time for almost a decade though it's a kind of bulky structure (something I'd use with convenience in mind more than efficiency).
It's worth noting that this is quite different from just using std::unordered_map directly. For a start, it preserves the original order in which one inserts elements. Insertion, removal, and searches are guaranteed to happen in logarithmic time (or constant time depending on whether key searching is involved and whether you use a hash table or BST), iterators do not get invalidated on insertion/removal (the main requirement I needed which made me favor std::map over std::unordered_map), etc.
The way I did it was like this:
// I use this as the iterator for my container with
// the list being the main 'focal point' while I
// treat the map as a secondary structure to accelerate
// key searches.
typedef typename std::list<Value>::iterator iterator;
// Values are stored in the list.
std::list<Value> data;
// Keys and iterators into the list are stored in a map.
std::map<Key, iterator> accelerator;
If you do it like this, it becomes quite easy. push_back is a matter of pushing back to the list and adding the last iterator to the map, iterator removal is a matter of removing the key pointed to by the iterator from the map before removing the element from the list as the list iterator, finding a key is a matter of searching the map and returning the associated value in the map which happens to be the list iterator, key removal is just finding a key and then doing iterator removal, etc.
If you want to improve all methods to constant time, then you can use std::unordered_map instead of std::map as I did here (though that comes with some caveats).
Taking an approach like this should simplify things considerably over an intrusive list-based solution where you're manually having to free memory.

Is there a reason why you are not using built-in containers like, say, STL?
Anyhow, you don't show the code where the allocation takes place, nor the map definition (is this coming from a library?).
Are you sure you deallocate all of the previously allocated Cells, starting from the last one and going backwards up to the first?

You could do this using the STL (remove next from Cell):
std::unordered_map<int,std::list<Cell>>
Or if cell only contains a char
std::unordered_map<int,std::string>
If your compiler doesn't support std::unordered_map then try boost::unordered_map.
If you really want to use intrusive data structures, have a look at Boost Intrusive.

As others have pointed out, it may be hard to see what you're doing wrong without seeing your code.
Someone should mention, however, that you're not helping yourself by overlaying two container types here.
If you're using a hash_map, you already have constant insertion and deletion time, see the related Hash : How does it work internally? post. The only exception to the O(c) lookup time is if your implementation decides to resize the container, in which case you have added overhead regardless of your linked list addition. Having two addressing schemes is only going to make things slower (not to mention buggier).
Sorry if this doesn't point you to the memory leak, but I'm sure a lot of memory leaks / bugs come from not using stl / boost containers to their full potential. Look into that first.

You need to be very careful with what you are doing, because values in a C++ map need to be copyable and with your structure that has raw pointers, you must handle your copy semantics properly.
You would be far better off using std::list where you won't need to worry about your copy semantics.
If you can't change that then at least std::map<int, Cell*> will be a bit more manageable, although you would have to manage the pointers in your map because std::map will not manage them for you.
You could of course use std::map<int, shared_ptr<Cell> >, probably easiest for you for now.
If you also use shared_ptr within your Cell object itself, you will need to beware of circular references, and as Cell will know it's being shared_ptr'd you could derive it from enable_shared_from_this
My final point will be that list is very rarely the correct collection type to use. It is the correct one to use sometimes, especially when you have an LRU cache situation and you want to move accessed elements to the end of the list fast. However that is the minority case and it probably doesn't apply here. Think of an alternative collection you really want. map< int, set<char> > perhaps? or map< int, vector< char > > ?
Your list has a lot of overheads to store a few chars

C++ std::map creation taking too long?

UPDATED:
I am working on a program whose performance is very critical. I have a vector of structs that are NOT sorted. I need to perform many search operations in this vector. So I decided to cache the vector data into a map like this:
std::map<long, int> myMap;
for (int i = 0; i < myVector.size(); ++i)
{
const Type& theType = myVector[i];
myMap[theType.key] = i;
}
When I search the map, the results of the rest of the program are much faster. However, the remaining bottleneck is the creation of the map itself (it is taking about 0.8 milliseconds on average to insert about 1,500 elements in it). I need to figure out a way to trim this time down. I am simply inserting a long as the key and an int as the value. I don't understand why it is taking this long.
Another idea I had was to create a copy of the vector (can't touch the original one) and somehow perform a faster sort than the std::sort (it takes way too long to sort it).
Edit:
Sorry everyone. I meant to say that I am creating a std::map where the key is a long and the value is an int. The long value is the struct's key value and the int is the index of the corresponding element in the vector.
Also, I did some more debugging and realized that the vector is not sorted at all. It's completely random. So doing something like a stable_sort isn't going to work out.
ANOTHER UPDATE:
Thanks everyone for the responses. I ended up creating a vector of pairs (std::vector of std::pair(long, int)). Then I sorted the vector by the long value. I created a custom comparator that only looked at the first part of the pair. Then I used lower_bound to search for the pair. Here's how I did it all:
typedef std::pair<long,int> Key2VectorIndexPairT;
typedef std::vector<Key2VectorIndexPairT> Key2VectorIndexPairVectorT;
bool Key2VectorIndexPairComparator(const Key2VectorIndexPairT& pair1, const Key2VectorIndexPairT& pair2)
{
return pair1.first < pair2.first;
}
...
Key2VectorIndexPairVectorT sortedVector;
sortedVector.reserve(originalVector.capacity());
// Assume "original" vector contains unsorted elements.
for (int i = 0; i < originalVector.size(); ++i)
{
const TheStruct& theStruct = originalVector[i];
sortedVector.insert(Key2VectorIndexPairT(theStruct.key, i));
}
std::sort(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairComparator);
...
const long keyToSearchFor = 20;
const Key2VectorIndexPairVectorT::const_iterator cItorKey2VectorIndexPairVector = std::lower_bound(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairT(keyToSearchFor, 0 /* Provide dummy index value for search */), Key2VectorIndexPairComparator);
if (cItorKey2VectorIndexPairVector->first == keyToSearchFor)
{
const int vectorIndex = cItorKey2VectorIndexPairVector->second;
const TheStruct& theStruct = originalVector[vectorIndex];
// Now do whatever you want...
}
else
{
// Could not find element...
}
This yielded a modest performance gain for me. Before the total time for my calculations were 3.75 milliseconds and now it is down to 2.5 milliseconds.

Both std::map and std::set are built on a binary tree and so adding items does dynamic memory allocation. If your map is largely static (i.e. initialized once at the start and then rarely or never has new items added or removed) you'd probably be better to use a sorted vector and a std::lower_bound to look up items using a binary search.

Maps take a lot of time for two reasons
You need to do a lot of memory allocation for your data storage
You need to perform O(n lg n) comparisons for the sort.
If you are just creating this as one batch, then throwing the whole map out, using a custom pool allocator may be a good idea here - eg, boost's pool_alloc. Custom allocators can also apply optimizations such as not actually deallocating any memory until the map's completely destroyed, etc.
Since your keys are integers, you may want to consider writing your own container based on a radix tree (on the bits of the key) as well. This may give you significantly improved performance, but since there is no STL implementation, you may need to write your own.
If you don't need to sort the data, use a hash table, such as std::unordered_map; these avoid the significant overhead needed for sorting data, and also can reduce the amount of memory allocation needed.
Finally, depending on the overall design of the program, it may be helpful to simply reuse the same map instead of recreating it over and over. Just delete and add keys as needed, rather than building a new vector, then building a new map. Again, this may not be possible in the context of your program, but if it is, it would definitely help you.

I suspect it's the memory management and tree rebalancing that's costing you here.
Obviously profiling may be able to help you pinpoint the issue.
I would suggest as a general idea to just copy the long/int data you need into another vector and since you said it's almost sorted, use stable_sort on it to finish the ordering. Then use lower_bound to locate the items in the sorted vector.

std::find is a linear scan(it has to be since it works on unsorted data). If you can sort(std::sort guaranties n log(n) behavior) the data then you can use std::binary_search to get log(n) searches. But as pointed out by others it may be copy time is the problem.

If keys are solid and short, perhaps try std::hash_map instead. From MSDN's page on hash_map Class:
The main advantage of hashing over sorting is greater efficiency; a
successful hashing performs insertions, deletions, and finds in
constant average time as compared with a time proportional to the
logarithm of the number of elements in the container for sorting
techniques.

Map creation can be a performance bottleneck (in the sense that it takes a measurable amount of time) if you're creating a large map and you're copying large chunks of data into it. You're also using the obvious (but suboptimal) way of inserting elements into a std::map - if you use something like:
myMap.insert(std::make_pair(theType.key, theType));
this should improve the insertion speed, but it will result in a slight change in behaviour if you encounter duplicate keys - using insert will result in values for duplicate keys being dropped, whereas using your method, the last element with the duplicate key will be inserted into the map.
I would also look into avoiding a making a copy of the data (for example by storing a pointer to it instead) if your profiling results determine that it's the copying of the element that is expensive. But for that you'll have to profile the code, IME guesstimates tend to be wrong...
Also, as a side note, you might want to look into storing the data in a std::set using custom comparator as your contains the key already. That however will not really result in a big speed up as constructing a set in this case is likely to be as expensive as inserting it into a map.

I'm not a C++ expert, but it seems that your problem stems from copying the Type instances, instead of a reference/pointer to the Type instances.
std::map<Type> myMap; // <-- this is wrong, since std::map requires two template parameters, not one
If you add elements to the map and they're not pointers, then I believe the copy constructor is invoked and that will certainly cause delays with a large data structure. Use the pointer instead:
std::map<KeyType, ObjectType*> myMap;
Furthermore, your example is a little confusing since you "insert" a value of type int in the map when you're expecting a value of type Type. I think you should assign the reference to the item, not the index.
myMap[theType.key] = &myVector[i];
Update:
The more I look at your example, the more confused I get. If you're using the std::map, then it should take two template types:
map<T1,T2> aMap;
So what are you REALLY mapping? map<Type, int> or something else?
It seems that you're using the Type.key member field as a key to the map (it's a valid idea), but unless key is of the same type as Type, then you can't use it as the key to the map. So is key an instance of Type??
Furthermore, you're mapping the current vector index to the key in the map, which indicates that you're just want the index to the vector so you can later access that index location fast. Is that what you want to do?
Update 2.0:
After reading your answer it seems that you're using std::map<long,int> and in that case there is no copying of the structure involved. Furthermore, you don't need to make a local reference to the object in the vector. If you just need to access the key, then access it by calling myVector[i].key.

Your building a copy of the table from the broken example you give, and not just a reference.
Why Can't I store references in an STL map in C++?
Whatever you store in the map it relies on you not changing the vector.
Try a lookup map only.
typedef vector<Type> Stuff;
Stuff myVector;
typedef std::map<long, *Type> LookupMap;
LookupMap myMap;
LookupMap::iterator hint = myMap.begin();
for (Stuff::iterator it = myVector.begin(); myVector.end() != it; ++it)
{
hint = myMap.insert(hint, std::make_pair(it->key, &*it));
}
Or perhaps drop the vector and just store it in the map??

Since your vector is already partially ordered, you may want to instead create an auxiliary array referencing (indices of) the elements in your original vector. Then you can sort the auxiliary array using Timsort which has good performance for partially sorted data (such as yours).

I think you've got some other problem. Creating a vector of 1500 <long, int> pairs, and sorting it based on the longs should take considerably less than 0.8 milliseconds (at least assuming we're talking about a reasonably modern, desktop/server type processor).
To try to get an idea of what we should see here, I did a quick bit of test code:
#include <vector>
#include <algorithm>
#include <time.h>
#include <iostream>
int main() {
const int size = 1500;
const int reps = 100;
std::vector<std::pair<long, int> > init;
std::vector<std::pair<long, int> > data;
long total = 0;
// Generate "original" array
for (int i=0; i<size; i++)
init.push_back(std::make_pair(rand(), i));
clock_t start = clock();
for (int i=0; i<reps; i++) {
// copy the original array
std::vector<std::pair<long, int> > data(init.begin(), init.end());
// sort the copy
std::sort(data.begin(), data.end());
// use data that depends on sort to prevent it being optimized away
total += data[10].first;
total += data[size-10].first;
}
clock_t stop = clock();
std::cout << "Ignore: " << total << "\n";
clock_t ticks = stop - start;
double seconds = ticks / (double)CLOCKS_PER_SEC;
double ms = seconds * 1000.0;
double ms_p_iter = ms / reps;
std::cout << ms_p_iter << " ms/iteration.";
return 0;
}
Running this on my somewhat "trailing edge" (~5 year-old) machine, I'm getting times around 0.1 ms/iteration. I'd expect searching in this (using std::lower_bound or std::upper_bound) to be somewhat faster than searching in an std::map as well (since the data in the vector is allocated contiguously, we can expect better locality of reference, leading to better cache usage).

Thanks everyone for the responses. I ended up creating a vector of pairs (std::vector of std::pair(long, int)). Then I sorted the vector by the long value. I created a custom comparator that only looked at the first part of the pair. Then I used lower_bound to search for the pair. Here's how I did it all:
typedef std::pair<long,int> Key2VectorIndexPairT;
typedef std::vector<Key2VectorIndexPairT> Key2VectorIndexPairVectorT;
bool Key2VectorIndexPairComparator(const Key2VectorIndexPairT& pair1, const Key2VectorIndexPairT& pair2)
{
return pair1.first < pair2.first;
}
...
Key2VectorIndexPairVectorT sortedVector;
sortedVector.reserve(originalVector.capacity());
// Assume "original" vector contains unsorted elements.
for (int i = 0; i < originalVector.size(); ++i)
{
const TheStruct& theStruct = originalVector[i];
sortedVector.insert(Key2VectorIndexPairT(theStruct.key, i));
}
std::sort(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairComparator);
...
const long keyToSearchFor = 20;
const Key2VectorIndexPairVectorT::const_iterator cItorKey2VectorIndexPairVector = std::lower_bound(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairT(keyToSearchFor, 0 /* Provide dummy index value for search */), Key2VectorIndexPairComparator);
if (cItorKey2VectorIndexPairVector->first == keyToSearchFor)
{
const int vectorIndex = cItorKey2VectorIndexPairVector->second;
const TheStruct& theStruct = originalVector[vectorIndex];
// Now do whatever you want...
}
else
{
// Could not find element...
}
This yielded a modest performance gain for me. Before the total time for my calculations were 3.75 milliseconds and now it is down to 2.5 milliseconds.

Find Key in Stl Hash_map

I am beginner in c++ and have some problem with hash table. I need a Hash table structure for my program. first I use boost unordered_map. it have all things that I need, but it make my program so slow. then I want to test stl hash_map, but I can't do all thing that I need. this is my first code ( this is sample)
#include <hash_map>
using namespace std;
struct eqstr
{
bool operator()(int s1, int s2) const
{
return s1==s2;
}
};
typedef stdext::hash_map< int, int, stdext::hash_compare< int, eqstr > > HashTable;
int main()
{
HashTable a;
a.insert( std::pair<int,int>( 1, 1 ) );
a.insert( std::pair<int,int>( 2, 2 ) );
a.insert( std::pair<int,int>( 4, 4 ) );
//next i want to change value of key 2 to 20
a[2] = 20;
//this code only insert pair<2,20> into a, buy when I use boost unordered_map this code modify previous key of 2
//next I try this code for delete 2 and insert new one
a.erase(2);//this code does work nothing !!!
//next I try to find 2 and delete it
HashTable::iterator i;
i = a.find(2);//this code return end, and does not work!!!
a.erase(i);//cause error
//but when I write this code, it works!!!
i=a.begin();
a.erase(i);
//and finally i write this code
for (i = a.begin(); i!=a.end(); ++i)
{
if (i->first == 2 )
break;
}
if (i!= a.end())
a.erase(i);
//and this code work
but if i want to search over my data, i use array not hash_map, why I can't access, modity and delete from hash_map with o(1)
what is my mistake, and which hash structure is fast for my program with many value modification in initializing phase. is google sparse_hash suitable for me, if it is, can give me some tutorial on it.
thanks for any help

You may look at: http://msdn.microsoft.com/en-us/library/525kffzd(VS.71).aspx
I think stdext::hash_compare< int, eqstr > is causing the problems here. Try to erase it.
Another implementation of a hash map is std::tr1::unordered_map. But I think that performance of various hash map implementation would be similar. Could you elaborate more about how slow the boost::unordered_map was? How did you use it? What for?

There are so many different varieties of hash map and many developers still write their own because you can so much more often get a higher performance in your own one, written to your own specific use, than you can from a generic one, and people tend to use hash when they want a really high performance.
The first thing you need to consider is what you really need to do and how much performance you really need, then determine whether something ready-made can meet that or whether you need to write something of your own.
If you never delete elements, for example, but just write once then constantly look-up, then you can often rehash to reduce collisions for the actual set you obtain: longer at setup time but faster at lookup.
An issue in writing your own will occur if you delete elements because it is not enough to "null" the entry, as another one may have stepped over yours as part of its collision course and now if you look that one up it will give up as "not found" as soon as it hits your null.

Sorting 1000-2000 elements with many cache misses

I have an array of 1000-2000 elements which are pointers to objects. I want to keep my array sorted and obviously I want to do this as quick as possible. They are sorted by a member and not allocated contiguously so assume a cache miss whenever I access the sort-by member.
Currently I'm sorting on-demand rather than on-add, but because of the cache misses and [presumably] non-inlining of the member access the inner loop of my quick sort is slow.
I'm doing tests and trying things now, (and see what the actual bottleneck is) but can anyone recommend a good alternative to speeding this up?
Should I do an insert-sort instead of quicksorting on-demand, or should I try and change my model to make the elements contigious and reduce cache misses?
OR, is there a sort algorithm I've not come accross which is good for data that is going to cache miss?
Edit: Maybe I worded this wrong :), I don't actually need my array sorted all the time (I'm not iterating through them sequentially for anything) I just need it sorted when I'm doing a binary chop to find a matching object, and doing that quicksort at that time (when I want to search) is currently my bottleneck, because of the cache misses and jumps (I'm using a < operator on my object, but I'm hoping that inlines in release)

Simple approach: insertion sort on every insert. Since your elements are not aligned in memory I'm guessing linked list. If so, then you could transform it into a linked list with jumps to the 10th element, the 100th and so on. This is kind of similar to the next suggestion.
Or you reorganize your container structure into a binary tree (or what every tree you like, B, B*, red-black, ...) and insert elements like you would insert them into a search tree.

Running a quicksort on each insertion is enormously inefficient. Doing a binary search and insert operation would likely be orders of magnitude faster. Using a binary search tree instead of a linear array would reduce the insert cost.
Edit: I missed that you were doing sort on extraction, not insert. Regardless, keeping things sorted amortizes sorting time over each insert, which almost has to be a win, unless you have a lot of inserts for each extraction.
If you want to keep the sort on-extract methodology, then maybe switch to merge sort, or another sort that has good performance for mostly-sorted data.

I think the best approach in your case would be changing your data structure to something logarithmic and rethinking your architecture. Because the bottleneck of your application is not that sorting thing, but the question why do you have to sort everything on each insert and try to compensate that by adding on-demand sort?.
Another thing you could try (that is based on your current implementation) is implementing an external pointer - something mapping table / function and sort those second keys, but I actually doubt it would benefit in this case.

Instead of the array of the pointers you may consider an array of structs which consist of both a pointer to your object and the sort criteria. That is:
Instead of
struct MyType {
// ...
int m_SomeField; // this is the sort criteria
};
std::vector<MyType*> arr;
You may do this:
strcut ArrayElement {
MyType* m_pObj; // the actual object
int m_SortCriteria; // should be always equal to the m_pObj->m_SomeField
};
std::vector<ArrayElement> arr;
You may also remove the m_SomeField field from your struct, if you only access your object via this array.
By such in order to sort your array you won't need to dereference m_pObj every iteration. Hence you'll utilize the cache.
Of course you must keep the m_SortCriteria always synchronized with m_SomeField of the object (in case you're editing it).

As you mention, you're going to have to do some profiling to determine if this is a bottleneck and if other approaches provide any relief.
Alternatives to using an array are std::set or std::multiset which are normally implemented as R-B binary trees, and so have good performance for most applications. You're going to have to weigh using them against the frequency of the sort-when-searched pattern you implemented.
In either case, I wouldn't recommend rolling-your-own sort or search unless you're interested in learning more about how it's done.

I would think that sorting on insertion would be better. We are talking O(log N) comparisons here, so say ceil( O(log N) ) + 1 retrieval of the data to sort with.
For 2000, it amounts to: 8
What's great about this is that you can buffer the data of the element to be inserted, that's how you only have 8 function calls to actually insert.
You may wish to look at some inlining, but do profile before you're sure THIS is the tight spot.

Nowadays you could use a set, either a std::set, if you have unique values in your structure member, or, std::multiset if you have duplicate values in you structure member.
One side note: The concept using pointers, is in general not advisable.
STL containers (if used correctly) give you nearly always an optimized performance.
Anyway. Please see some example code:
#include <iostream>
#include <array>
#include <algorithm>
#include <set>
#include <iterator>
// Demo data structure, whatever
struct Data {
int i{};
};
// -----------------------------------------------------------------------------------------
// All in the below section is executed during compile time. Not during runtime
// It will create an array to some thousands pointer
constexpr std::size_t DemoSize = 4000u;
using DemoPtrData = std::array<const Data*, DemoSize>;
using DemoData = std::array<Data, DemoSize>;
consteval DemoData createDemoData() {
DemoData dd{};
int k{};
for (Data& d : dd)
d.i = k++*2;
return dd;
}
constexpr DemoData demoData = createDemoData();
consteval DemoPtrData createDemoPtrData(const DemoData& dd) {
DemoPtrData dpd{};
for (std::size_t k{}; k < dpd.size(); ++k)
dpd[k] = &dd[k];
return dpd;
}
constexpr DemoPtrData dpd = createDemoPtrData(demoData);
// -----------------------------------------------------------------------------------------
struct Comp {bool operator () (const Data* d1, const Data* d2) const { return d1->i < d2->i; }};
using MySet = std::multiset<const Data*, Comp>;
int main() {
// Add some thousand pointers. Will be sorted according to struct member
MySet mySet{ dpd.begin(), dpd.end() };
// Extract a range of data. integer values between 42 and 52
const Data* p42 = dpd[21];
const Data* p52 = dpd[26];
// Show result
for (auto iptr = mySet.lower_bound(p42); iptr != mySet.upper_bound(p52); ++iptr)
std::cout << (*iptr)->i << '\n';
// Insert a new element
Data d1{ 47 };
mySet.insert(&d1);
// Show again
std::cout << "\n\n";
for (auto iptr = mySet.lower_bound(p42); iptr != mySet.upper_bound(p52); ++iptr)
std::cout << (*iptr)->i << '\n';
}

std::map insert or std::map find?

Assuming a map where you want to preserve existing entries. 20% of the time, the entry you are inserting is new data. Is there an advantage to doing std::map::find then std::map::insert using that returned iterator? Or is it quicker to attempt the insert and then act based on whether or not the iterator indicates the record was or was not inserted?

The answer is you do neither. Instead you want to do something suggested by Item 24 of Effective STL by Scott Meyers:
typedef map<int, int> MapType; // Your map type may vary, just change the typedef
MapType mymap;
// Add elements to map here
int k = 4; // assume we're searching for keys equal to 4
int v = 0; // assume we want the value 0 associated with the key of 4
MapType::iterator lb = mymap.lower_bound(k);
if(lb != mymap.end() && !(mymap.key_comp()(k, lb->first)))
{
// key already exists
// update lb->second if you care to
}
else
{
// the key does not exist in the map
// add it to the map
mymap.insert(lb, MapType::value_type(k, v)); // Use lb as a hint to insert,
// so it can avoid another lookup
}

The answer to this question also depends on how expensive it is to create the value type you're storing in the map:
typedef std::map <int, int> MapOfInts;
typedef std::pair <MapOfInts::iterator, bool> IResult;
void foo (MapOfInts & m, int k, int v) {
IResult ir = m.insert (std::make_pair (k, v));
if (ir.second) {
// insertion took place (ie. new entry)
}
else if ( replaceEntry ( ir.first->first ) ) {
ir.first->second = v;
}
}
For a value type such as an int, the above will more efficient than a find followed by an insert (in the absence of compiler optimizations). As stated above, this is because the search through the map only takes place once.
However, the call to insert requires that you already have the new "value" constructed:
class LargeDataType { /* ... */ };
typedef std::map <int, LargeDataType> MapOfLargeDataType;
typedef std::pair <MapOfLargeDataType::iterator, bool> IResult;
void foo (MapOfLargeDataType & m, int k) {
// This call is more expensive than a find through the map:
LargeDataType const & v = VeryExpensiveCall ( /* ... */ );
IResult ir = m.insert (std::make_pair (k, v));
if (ir.second) {
// insertion took place (ie. new entry)
}
else if ( replaceEntry ( ir.first->first ) ) {
ir.first->second = v;
}
}
In order to call 'insert' we are paying for the expensive call to construct our value type - and from what you said in the question you won't use this new value 20% of the time. In the above case, if changing the map value type is not an option then it is more efficient to first perform the 'find' to check if we need to construct the element.
Alternatively, the value type of the map can be changed to store handles to the data using your favourite smart pointer type. The call to insert uses a null pointer (very cheap to construct) and only if necessary is the new data type constructed.

There will be barely any difference in speed between the 2, find will return an iterator, insert does the same and will search the map anyway to determine if the entry already exists.
So.. its down to personal preference. I always try insert and then update if necessary, but some people don't like handling the pair that is returned.

I would think if you do a find then insert, the extra cost would be when you don't find the key and performing the insert after. It's sort of like looking through books in alphabetical order and not finding the book, then looking through the books again to see where to insert it. It boils down to how you will be handling the keys and if they are constantly changing. Now there is some flexibility in that if you don't find it, you can log, exception, do whatever you want...

If you are concerned about efficiency, you may want to check out hash_map<>.
Typically map<> is implemented as a binary tree. Depending on your needs, a hash_map may be more efficient.

I don't seem to have enough points to leave a comment, but the ticked answer seems to be long winded to me - when you consider that insert returns the iterator anyway, why go searching lower_bound, when you can just use the iterator returned. Strange.

Any answers about efficiency will depend on the exact implementation of your STL. The only way to know for sure is to benchmark it both ways. I'd guess that the difference is unlikely to be significant, so decide based on the style you prefer.

map[ key ] - let stl sort it out. That's communicating your intention most effectively.
Yeah, fair enough.
If you do a find and then an insert you're performing 2 x O(log N) when you get a miss as the find only lets you know if you need to insert not where the insert should go (lower_bound might help you there). Just a straight insert and then examining the result is the way that I'd go.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js