I want to find indices of a set efficiently. I am using unordered_map and making the inverse map like this
std::unordered_map <int, int> myHash (size);
Int i = 0;
for (it = someSet.begin(); it != someSet.end(); it++)
{
myHash.insert({*it , i++});
}
It works but it is not efficient. I did this so anytime I need the indices I could access them O(1). Performance analysis is showing me that this part became hotspot of my code.
VTune tells me that new operator is my hotspot. I guess something is happening inside the unordered_map.
It seems to me that this case should be handled efficiently. I couldn't find a good way yet. Is there a better solution? a correct constructor?
Maybe I should pass more info to the constructor. I looked up the initialize list but it is not exactly what I want.
Update: Let me add some more information. The set is not that important; I save the set in to an array (sorted). Later I need to find the index of the values which are unique. I can do it in logn but it is not fast enough. It is why I decided to use a hash. The size of the set (columns of submatrix) doesn't change after this point.
It arise from sparse matrix computation which I need to find index of submatrices in a bigger matrix. Therefore the size and the pattern of the look ups is depend on the input matrix. It works reasonable on smaller problems. I could use a lookup table but while I am planning to do it in parallel the lookup table for each thread can be expensive. I have the exact size of hash in the time of creation. I thought by sending it to the constructor it stops reallocating. I really don't understand why it is reallocating this much.
The problem is, std::unordered_map, mainly implemented as a list of vectors, is extremely cache-unfriendly, and will perform especially poorly with small keys/values (like int,int in your case), not to mention requiring tons of (re-)allocations.
As an alternative you can try a third-party hash map implementing open addressing with linear probing (a mouthful, but the underlying structure is simply a vector, i.e. much more cache-friendly). For example, Google's dense_hash_map or this: flat_hash_map. Both can be used as a drop-in replacement for unordered_map, and only additionally require to designate one int value as the "empty" key.
std::unordered_map<int, int> is often implemented as if it was
std::vector<std::list<std::par<int, int>>>
Which causes a lot of allocations and deallocations of each node, each (de-)allocation is using a lock which causes contention.
You can help it a bit by using emplace instead of insert, or you can jump out in the fantastic new world of pmr allocators. If your creation and destruction of the pmr::unordered_map is single threaded you should be able to get a lot of extra performance out of it. See Jason Turners C++ Weekly - Ep 222 - 3.5x Faster Standard Containers With PMR!, his example is a bit on the small side but you can get the general idea.
Related
So previously I only had 1 key I needed to look up, so I was able to use a map:
std::map <int, double> freqMap;
But now I need to look up 2 different keys. I was thinking of using a vector with std::pair i.e.:
std::vector <int, std::pair<int, double>> freqMap;
Eventually I need to look up both keys to find the correct value. Is there a better way to do this, or will this be efficient enough (will have ~3k entries). Also, not sure how to search using the second key (first key in the std::pair). Is there a find for the pair based on the first key? Essentially I can access the first key by:
freqMap[key1]
But not sure how to iterate and find the second key in the pair.
Edit: Ok adding the use case for clarification:
I need to look up a val based on 2 keys, a mux selection and a frequency selection. The raw data looks something like this:
Mux, Freq, Val
0, 1000, 1.1
0, 2000, 2.7
0, 10e9, 1,7
1, 1000, 2.2
1, 2500, 0.8
6, 2000, 2.2
The blanket answer to "which is faster" is generally "you have to benchmark it".
But besides that, you have a number of options. A std::map is more efficient than other data structures on paper, but not necessarily in practice. If you truly are in a situation where this is performance critical (i.e. avoid premature optimisation) try different approaches, as sketched below, and measure the performance you get (memory-wise and cpu-wise).
Instead of using a std::map, consider throwing your data into a struct, give it proper names and store all values in a simple std::vector. If you modify the data only seldom, you can optimise retrieval cost at the expense of additional insertion cost by sorting the vector according to the key you are typically using to find an entry. This will allow you to do binary search, which can be much faster than linear search.
However, linear search can be surprisingly fast on a std::vector because of both cache locality and branch prediction. Both of which you are likely to lose when dealing with a map, unordered_map or (binary searched) sorted vector. So, although O(n) sounds much more scary than, say, O(log n) for map or even O(1) for unordered_map, it can still be faster under the right conditions.
Especially if you discover that you don't have a discernible index member you can use to sort your entries, you will have to either stick to linear search in contiguous memory (i.e. vector) or invest into a doubly indexed data structure (effectively something akin to two maps or two unordered_maps). Having two indexes usually prevents you from using a single map/unordered_map.
If you can pack your data more tightly (i.e. do you need an int or would a std::uint8_t do the job?, do you need a double? etc.) you will amplify cache locality and for only 3k entries you have good chances of an unsorted vector to perform best. Although operations on an std::size_t are typically faster themselves than on smaller types, iterating over contiguous memory usually offsets this effect.
Conclusion: Try an unsorted vector, a sorted vector (+binary search), a map and an unordered_map. Do proper benchmarking (with several repetitions) and pick the fastest one. If it doesn't make a difference pick the one that is the most straight-forward to understand.
Edit: Given your example data, it sounds like the first key has an extremely small domain. As far as I can tell "Mux" seems to be limited to a small number of different values which are near each other, in such a situation you may consider using an std::array as your primary indexing structure and have a suitable lookup structure as your second one. For example:
std::array<std::vector<std::pair<std::uint64_t,double>>,10>
std::array<std::unordered_map<std::uint64_t,double>,10>
I'm building a little 2d game engine. Now I need to store the prototypes of the game objects (all type of informations). A container that will have at most I guess few thousand elements all with unique key and no elements will be deleted or added after a first load. The key value is a string.
Various threads will run, and I need to send to everyone a key(or index) and with that access other information(like a texture for the render process or sound for the mixer process) available only to those threads.
Normally I use vectors because they are way faster to accessing a known element. But I see that unordered map also usually have a constant speed if I use the ::at element access. It would make the code much cleaner and also easier to maintain because I will deal with much more understandable man made strings.
So the question is, the difference in speed between a access to a vector[n] compared to a unorderedmap.at("string") is negligible compared to his benefits?
From what I understand accessing various maps in different part of the program, with different threads running just with a "name" for me is a big deal and the speed difference isn't that great. But I'm too inexperienced to be sure of this. Although I found informations about it seem I can't really understand if I'm right or wrong.
Thank you for your time.
As an alternative, you could consider using an ordered vector because the vector itself will not be modified. You can easily write an implementation yourself with STL lower_bound etc, or use an implementation from libraries ( boost::flat_map).
There is a blog post from Scott Meyers about container performance in this case. He did some benchmarks and the conclusion would be that an unordered_mapis probably a very good choice with high chances that it will be the fastest option. If you have a restricted set of keys, you can also compute a minimal optimal hash function, e.g. with gperf
However, for these kind of problems the first rule is to measure yourself.
My problem was to find a record on a container by a given std::string type as Key access. Considering Keys that only EXISTS(not finding them was not a option) and the elements of this container are generated only at the beginning of the program and never touched thereafter.
I had huge fears unordered map was not fast enough. So I tested it, and I want to share the results hoping I've not mistaken everything.
I just hope that can help others like me and to get some feedback because in the end I'm beginner.
So, given a struct of record filled randomly like this:
struct The_Mess
{
std::string A_string;
long double A_ldouble;
char C[10];
int* intPointer;
std::vector<unsigned int> A_vector;
std::string Another_String;
}
I made a undordered map, give that A_string contain the key of the record:
std::unordered_map<std::string, The_Mess> The_UnOrdMap;
and a vector I sort by the A_string value(which contain the key):
std::vector<The_Mess> The_Vector;
with also a index vector sorted, and used to access as 3thrd way:
std::vector<std::string> index;
The key will be a random string of 0-20 characters in lenght(I wanted the worst possible scenario) containing letter both capital and normal and numbers or spaces.
So, in short our contendents are:
Unordered map I measure the time the program get to execute:
record = The_UnOrdMap.at( key ); record is just a The_Mess struct.
Sorted Vector measured statements:
low = std::lower_bound (The_Vector.begin(), The_Vector.end(), key, compare);
record = *low;
Sorted Index vector:
low2 = std::lower_bound( index.begin(), index.end(), key);
indice = low2 - index.begin();
record = The_Vector[indice];
The time is in nanoseconds and is a arithmetic average of 200 iterations. I have a vector that I shuffle at every iteration containing all the keys, and at every iteration I cycle through it and look for the key I have here in the three ways.
So this are my results:
I think the initials spikes are a fault of my testing logic(the table I iterate contains only the keys generated so far, so it only has 1-n elements). So 200 iterations of 1 key search for the first time. 200 iterations of 2 keys search the second time etc...
Anyway, it seem that in the end the best option is the unordered map, considering that is a lot less code, it's easier to implement and will make the whole program way easier to read and probably maintain/modify.
You have to think about caching as well. In case of std::vector you'll have very good cache performance when accessing the elements - when accessing one element in RAM, CPU will cache nearby memory values and this will include nearby portions of your std::vector.
When you use std::map (or std::unordered_map) this is no longer true. Maps are usually implemented as self balancing binary-search trees, and in this case values can be scattered around the RAM. This imposes great hit on cache performance, especially as maps get bigger and bigger as CPU just cannot cache the memory that you're about to access.
You'll have to run some tests and measure performance, but cache misses can greatly hurt the performance of your program.
You are most likely to get the same performance (the difference will not be measurable).
Contrary to what some people seem to believe, unordered_map is not a binary tree. The underlying data structure is a vector. As a result, cache locality does not matter here - it is the same as for vector. Granted, you are going to suffer if you have collissions due to your hashing function being bad. But if your key is a simple integer, this is not going to happen. As a result, access to to element in hash map will be exactly the same as access to the element in the vector with time spent on getting hash value for integer, which is really non-measurable.
I am making a small project program that involves inputting quotes that would be later saved into a database (in this case a .txt file). There are also commands that the user would input such as list (which shows the quote by author) and random (which displays a random quote).
Here's the structure if I would use a map (with the author string as the key):
struct Information{
string quoteContent;
vector<string> tags;
}
and here's the structure if I would use the vector instead:
struct Information{
string author;
string quoteContent;
vector<string> tags;
}
note: The largest largest number of quotes I've had in the database is 200. (imported from a file)
I was just wondering which data structure would yield better performance. I'm still pretty new to this c++ thing, so any help would be appreciated!
For your data volumes it obviously doesn't matter from a performance perspective, but multi_map will likely let you write shorter, more comprehensible and maintainable code. Regarding general performance of vector vs maps (which is good to know about but likely only becomes relevant with millions of data elements or low-latency requirements)...
vector doesn't do any automatic sorting for you, so you'd probably push_back quotes as you read them, then do one std::sort once the data's loaded, after which you can find elements very quickly by author with std::binary_search or std::lower_bound, or identify insertion positions for new quotes using e.g. std::lower_bound, but if you want to insert a new quote thereafter you have to move the existing vector elements from that position on out of the way to make room - that's relatively slow. As you're just doing a few ad-hoc insertions based on user input, the time to do that with only a few hundred quotes in the vector will be totally insignificant. For the purposes of learning programming though, it's good to understand that a multimap is arranged as a kind of branching binary tree, with pointers linking the data elements, which allows for relatively quick insertion (and deletion). For some applications following all those pointers around can be more expensive (i.e. slower) than vector's contiguous memory (which works better with CPU cache memory), but in your case the data elements are all strings and vectors of strings that will likely (unless Short String Optimisations kick in) require jumping all over memory anyway.
In general, if author is naturally a key for your data just use a multi_map... it'll do all your operations in reasonable time, maybe not the fastest but never particularly slow, unlike vector for post-data-population mid-container insertions (/deletions).
Depends on the purpose of usage. Both data-structures have their pros and cons.
Vectors
Position index at() or operator []
Find function not present You would have to use find algorithm func.
Maps:
Key can be searched
Position index is not applicable. Keys are stored
(use unordered map for better performance than map.)
Use datastructure on basis of what you want to achieve.
The golden rule is: "When in doubt, measure."
i.e. Write some tests, do some benchmarking.
Anyway, considering that you have circa 200 items, I don't think there should be an important difference from the two cases on modern PC hardware. Big-O notation matters when N is big (e.g. 10,000s, 100,000s, 1,000,000s, etc.)
vector tends to be simpler than map, and I'd use it as the default container of choice (unless your main goal is to access the items given the author's name as a key, in this case map seems more logically suited).
Another option might be to have a vector with items sorted using author's names, so you can use binary search (which is O(logN)) inside the vector.
I'm developing game. I store my game-objects in this map:
std::map<std::string, Object*> mObjects;
std::string is a key/name of object to find further in code. It's very easy to point some objects, like: mObjects["Player"] = .... But I'm afraid it's to slow due to allocation of std::string in each searching in that map. So I decided to use int as key in that map.
The first question: is that really would be faster?
And the second, I don't want to remove my current type of objects accesing, so I found the way: store crc string calculating as key. For example:
Object *getObject(std::string &key)
{
int checksum = GetCrc32(key);
return mObjects[checksum];
}
Object *temp = getOject("Player");
Or this is bad idea? For calculating crc I would use boost::crc. Or this is bad idea and calculating of checksum is much slower than searching in map with key type std::string?
Calculating a CRC is sure to be slower than any single comparison of strings, but you can expect to do about log2N comparisons before finding the key (e.g. 10 comparisons for 1000 keys), so it depends on the size of your map. CRC can also result in collisions, so it's error prone (you could detect collisions relatively easily detect, and possibly even handle them to get correct results anyway, but you'd have to be very careful to get it right).
You could try an unordered_map<> (possibly called hash_map) if your C++ environment provides one - it may or may not be faster but won't be sorted if you iterate. Hash maps are yet another compromise:
the time to hash is probably similar to the time for your CRC, but
afterwards they can often seek directly to the value instead of having to do the binary-tree walk in a normal map
they prebundle a bit of logic to handle collisions.
(Silly point, but if you can continue to use ints and they can be contiguous, then do remember that you can replace the lookup with an array which is much faster. If the integers aren't actually contiguous, but aren't particularly sparse, you could use a sparse index e.g. array of 10000 short ints that are indices into 1000 packed records).
Bottom line is if you care enough to ask, you should implement these alternatives and benchmark them to see which really works best with your particular application, and if they really make any tangible difference. Any of them can be best in certain circumstances, and if you don't care enough to seriously compare them then it clearly means any of them will do.
For the actual performance you need to profile the code and see it. But I would be tempted to use hash_map. Although its not part of the C++ standard library most of the popular implentations provide it. It provides very fast lookup.
The first question: is that really would be faster?
yes - you're comparing an int several times, vs comparing a potentially large map of strings of arbitrary length several times.
checksum: Or this is bad idea?
it's definitely not guaranteed to be unique. it's a bug waiting to bite.
what i'd do:
use multiple collections and embrace type safety:
// perhaps this simplifies things enough that t_player_id can be an int?
std::map<t_player_id, t_player> d_players;
std::map<t_ghoul_id, t_ghoul> d_ghouls;
std::map<t_carrot_id, t_carrot> d_carrots;
faster searches, more type safety. smaller collections. smaller allocations/resizes.... and on and on... if your app is very trivial, then this won't matter. use this approach going forward, and adjust after profiling/as needed for existing programs.
good luck
If you really want to know you have to profile your code and see how long does the function getObject take. Personally I use valgrind and KCachegrind to profile and render data on UNIX system.
I think using id would be faster. It's faster to compare int than string so...
Correct me I'm wrong but std::map is an ordered map, thus each time I insert a value the map uses an algorithm to sort its items internally, which takes some time.
My application gets information regarding some items on a constant interval.
This app keeps a map which is defined like this:
::std::map<DWORD, myItem*>
At first all items are considered "new" to the app. An "Item" object is being allocated and added to this map, associating its id and a pointer to it.
When it's not a "new" item (just an update of this object) my app should find the object at the map, using the given id, and update.
Most of the times I get updates.
My question is:
Is there any faster map implementation or should I keep using this one?
Am I better use unordered_map?
Am I better use unordered_map?
Possibly.
std:map provides consistent performance at O(log n) because it needs to be implemented as a balanced tree. But std:unordered_map will be implemented as a hash table which might give you O(1) performance (good hash function and distribution of keys across hash buckets), but it could be O(n) (everything in one hash bucket and devolves to a list). One would normally expect something inbetween these extremes.
So you can have reasonable performance (O(log n)) all the time, or you need to ensure everything lines up to get good performance with a hash.
As with any such question: you need to measure before committing to one approach. Unless your datasets are large you might find there is no significant difference.
Important warning: Unless you have measured (and your question suggests that you haven't) that map performance substantially influences your application performance (large percentage of time is spent on searching and updating the map) don't bother with making it faster.
Stick to std::map (or std::unordered_map or any available hash_map implementation).
Speeding up your application by 1% probably will not be worth the effort.
Make it bug free instead.
Echoing Richard's answer: measure performance with different map implementation using your real classes and real data.
Some additional notes:
Understand the difference between expected cost (hash maps usually have it lower), worst case cost (O(logn) for balanced binary tree but much higher for hash map if insert triggers reallocation of hash array) and amortized cost (total cost divided by number of operations or elements; depends on things like ratio of new and existing elements). You need to find out which is more constraining in your case. For example reallocating of hash maps can be too much if you need to adhere to very low latency limit.
Find out where real bottleneck is. It might be that cost of searching in map is insignificant compared to e.g. IO cost.
Try more specialized map implementation. For example a lot can be gained if you know something more about map's key. Authors of generic map implementations do not have such knowledge.
In your example (32 bit unsigned integer keys which strongly cluster, e.g. are assigned sequentially) you can use radix based approach. Very simple example (threat it as an illustration, not ready to use recipe):
Item *sentinel[65536]; // sentinel page, initialized to NULLs.
Item (*pages[65536])[65536]; // list of pages,
// initialized so every element points to sentinel
Then search is as simple as:
Item *value = pages[index >> 16][index & 0xFFFF];
When you need to set new value:
if (pages[index >> 16] == sentinel) {
pages[index >> 16] = allocate_new_null_filled_page();
}
pages[index >> 16][index & 0xFFFF] = value;
Tweak your map implementation.
E.g. every hash_map likes to know approximate number of elements in advance. It helps avoid unnecessary reallocation of hash table and (possibly) rehashing of all keys.
With my specialized example above you certainly would try different page sizes, or three level version.
Common optimization is providing specialized memory allocator to avoid multiple allocations of small objects.
Whenever you insert or delete item, the memory allocation/deallocation costs a lot. Instead you can use an allocator like this one: https://github.com/moya-lang/Allocator which speeds up std::map twice as author says, but I found it even faster especially for other STL containers.