C++: multiple keyed map

I am looking for a (multi)map whose values are associated with keys of several different types. Basically what was asked here for Java, but for C++. Does something like this already exist, or do I have to implement it myself?
A second, simpler case (a solution to the general case above would cover it, but there may be a simpler solution for this case specifically):
I want a multimap in which all values are unique and ordered (the keys are ordered too, of course), and I want to be able to search the map for a specific value in O(log n) time. That is, I can get the key associated with a value in O(log n) time, and the value associated with a key also in O(log n) time.

If you want to be able to search both by key and by value, use Boost.Bimap.
If you need multiple keys, use Boost.MultiIndex.
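If Boost is not an option, the two-way lookup can be approximated with the standard library alone. The following is only a minimal sketch, not Boost.Bimap's interface: `SimpleBimap` and its member names are hypothetical, and it assumes a one-to-one relation (both keys and values unique). It keeps two `std::map` indexes in sync, so each direction is an O(log n) lookup.

```cpp
#include <cstddef>
#include <map>

// Hypothetical helper (not part of Boost or the standard library):
// a one-to-one map that keeps two ordered indexes in sync.
template <typename Left, typename Right>
class SimpleBimap {
public:
    // Inserts the pair only if neither side is already present.
    bool insert(const Left& l, const Right& r) {
        if (left_.count(l) != 0 || right_.count(r) != 0) return false;
        left_.emplace(l, r);
        right_.emplace(r, l);
        return true;
    }

    // O(log n) lookup by key; returns nullptr when absent.
    const Right* find_left(const Left& l) const {
        auto it = left_.find(l);
        return it == left_.end() ? nullptr : &it->second;
    }

    // O(log n) lookup by value; returns nullptr when absent.
    const Left* find_right(const Right& r) const {
        auto it = right_.find(r);
        return it == right_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return left_.size(); }

private:
    std::map<Left, Right> left_;   // key -> value, ordered by key
    std::map<Right, Left> right_;  // value -> key, ordered by value
};
```

The cost of this design is that every element is stored twice, which is one reason a dedicated library may be preferable for large values.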

Boost Multi-Index.

Related

Find common value in two maps without iterating

I have these two maps, each storing 10000+ of entries:
std::map<std::string,ObjectA> mapA;
std::map<std::string,ObjectB> mapB;
I want to retrieve only those values from the maps whose keys are present in both maps.
For example, if key "10001" is found in both mapA and mapB, then I want the corresponding objects from both maps, something like a join on SQL tables. The easiest way would be to iterate over the smaller map and call find(iter->first) on the other map in each iteration to find the keys that qualify. That would also be very expensive.
Instead, I am considering maintaining a set like this:
std::set<std::string> common;
1) Every time I insert into one of the maps, I check whether the key exists in the other map. If it does, I add the key to the common set above.
2) Every time I remove an entry from one of the maps, I remove the key from the common set, if it exists there.
The common set always maintains the keys that are in both maps. When I want to do the join, I already have the qualifying keys. Is there a faster/better way?
The algorithm is pretty simple. First, you treat the two maps as sequences (using iterators).
If either remaining sequence is empty, you're done.
If the keys at the front of the sequence are the same, you have found a match.
If the keys differ, discard the lower (according to the map's sorting order) of the two.
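You'll be iterating over each map at most once, which means a complexity of O(n+m). For maps of similar size this is significantly better than the naive approach of looking up every key of one map in the other, with its O(n log m) or O(m log n) complexity. (If one map is much smaller than the other, the lookup approach can still be competitive.)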
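The steps above can be sketched as follows. This is an illustrative version under a few assumptions: the function name `join_by_key` is made up here, both maps use the default `std::less` ordering, and matched values are copied into the result (storing iterators or pointers instead would avoid the copies).

```cpp
#include <map>
#include <utility>
#include <vector>

// Walks both ordered maps in lockstep and collects the values whose keys
// appear in both. Assumes both maps use the same (default) key ordering.
template <typename K, typename A, typename B>
std::vector<std::pair<A, B>> join_by_key(const std::map<K, A>& a,
                                         const std::map<K, B>& b) {
    std::vector<std::pair<A, B>> out;
    auto ia = a.begin();
    auto ib = b.begin();
    // Stop as soon as either remaining sequence is empty.
    while (ia != a.end() && ib != b.end()) {
        if (ia->first < ib->first) {
            ++ia;  // front key of a is lower: discard it
        } else if (ib->first < ia->first) {
            ++ib;  // front key of b is lower: discard it
        } else {
            out.emplace_back(ia->second, ib->second);  // keys match
            ++ia;
            ++ib;
        }
    }
    return out;
}
```

For the maps in the question, a call would look like `auto joined = join_by_key(mapA, mapB);`, giving a vector of (ObjectA, ObjectB) pairs in key order.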

std::map's behavior on referring to a key

I am writing a numerical simulation program that uses std::map to store key-value pairs. The map stores the states that evolve during the simulation. The key is an integer, and the value for each key records how many copies of that key exist, i.e. a std::map from an integer key to a count. At each step of the simulation I need to know how many copies there are for a given key, so I check with the following code:
if (map[key]>0) {do something here with the number of copies}
However, I soon found that this code doesn't work: even when there is no such key in the map, calling map[key] creates a placeholder for that key with the value zero, so std::map.size() always overcounts the total number of keys. I later changed the code as follows to search for the key instead:
if (map.find(key)!=map.end()) {...}
So is this the only and the fastest way to check whether a key exists in a map? I am going to run the simulation hundreds of millions of times, and it will call the code above very often to check the key. Will map.find() be too slow? Thanks.
The find member function is probably the fastest way to find whether a key is already in the map. That said, if you don't need to iterate over items in the map in order, you might get better performance with an std::unordered_map instead.
In a std::map or a hash table (std::unordered_map), the find function is fast (O(log n) for std::map, O(1) on average for std::unordered_map), as fast as using the [] subscript operator. In fact, it's faster when the element is not found, because it doesn't have to insert one.
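As an illustration of the difference, here is a small sketch of a read-only count lookup. The helper name `copies_of` and the `std::map<int, int>` type are assumptions made for the example, since the question does not show the exact map type.

```cpp
#include <map>

// Returns the stored count for key, or 0 when the key is absent.
// Unlike counts[key], this never inserts a placeholder entry,
// so counts.size() stays accurate.
inline int copies_of(const std::map<int, int>& counts, int key) {
    auto it = counts.find(key);
    return it == counts.end() ? 0 : it->second;
}
```

Taking the map by const reference also makes an accidental `counts[key]` a compile error, because `operator[]` is not available on a const map.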
I don't think there is much difference in speed between the various ways to check for the existence of a key. On the other hand, if your keys are integers and their range is known, you might just use an array.
BTW:
I got interested in the speed of a plain array, vector, map and unordered map. I wrote a simple program that performs container[n]++ 100000000 times, where n is a random number in the range 0 to 10000. The results:
array: 1.27s
vector: 1.36s
unordered map: 2.6s
map: 11.6s
The overhead of loop + index calculation in this simple case is ~0.8s.
So it all depends on how much time is spent elsewhere. If it's considerably more (per 100000000 iterations) then it does not matter much what you use. But if it's not, it can be quite a difference.
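The array suggestion can be sketched like this, assuming the keys are non-negative integers below a bound known in advance. `DenseCounter` is a hypothetical name for the example, and the code is not the benchmark program used for the timings above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense counter for integer keys in [0, range).
class DenseCounter {
public:
    explicit DenseCounter(std::size_t range) : counts_(range, 0) {}

    // Records one more copy of key; at() throws std::out_of_range
    // if key is outside [0, range).
    void add(std::size_t key) { ++counts_.at(key); }

    // Reading never creates an entry, so there is nothing to overcount.
    int copies_of(std::size_t key) const { return counts_.at(key); }

    // Number of keys with at least one copy (linear scan over the range).
    std::size_t distinct_keys() const {
        std::size_t n = 0;
        for (int c : counts_) {
            if (c > 0) ++n;
        }
        return n;
    }

private:
    std::vector<int> counts_;
};
```

The trade-off is memory proportional to the key range rather than to the number of distinct keys, so this only pays off when the range is reasonably small.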
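You can use a hash map (std::unordered_map since C++11; hash_map in older, non-standard library extensions), which is typically the fastest container for this kind of key-value lookup.
You can also use std::map, but it is usually slower than a hash map.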

What's the best way to search from several map<key,value>?

I have created a vector which contains several map<>.
vector<map<key,value>*> v;
v.push_back(&map1);
// ...
v.push_back(&map2);
// ...
v.push_back(&map3);
At any point in time, if a value has to be retrieved, I iterate through the vector and look up the key in every map element (i.e. v[0], v[1], etc.) until it is found. Is this the best way? I am open to any suggestions. This is just an idea; I have yet to implement it (please point out any mistakes).
Edit: It is not important in which map the element is found. Different modules prepare different maps, and they are added one by one as the code progresses. Whenever a key is searched for, it should be searched in all maps added up to that time.
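For reference, the lookup described in the question could look roughly like the sketch below. The function name `find_in_any` is hypothetical. With k maps of up to n entries each, a lookup costs O(k log n) in the worst case.

```cpp
#include <map>
#include <vector>

// Searches each map in order and returns a pointer to the first matching
// value, or nullptr when no map contains the key.
template <typename K, typename V>
const V* find_in_any(const std::vector<std::map<K, V>*>& maps, const K& key) {
    for (const std::map<K, V>* m : maps) {
        auto it = m->find(key);
        if (it != m->end()) return &it->second;
    }
    return nullptr;
}
```

The returned pointer stays valid only as long as the owning map is alive and the entry is not erased.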
Without more information on the purpose and use, it is a little difficult to answer. For example, is it necessary to have multiple map objects? If not, you could store all of the items in a single map and eliminate the vector altogether. Lookups would then be more efficient. If there are duplicate entries in the maps, the key for each value could include the differentiating information that currently determines which map the value is put into.
If you need to know which submap the key was found in, try:
unordered_map<key, pair<mapid, value>>
This has much better complexity for searching.
If the keys do not overlap, i.e., are unique throughout all maps, then I'd advise a set or unordered_set with a custom comparison functor, as this will help with the lookup. Or even extend the first map with the new maps, if profiling shows that is fast enough / faster.
If the keys are not unique, go with a multiset or unordered_multiset, again with a custom comparison functor.
You could also sort your vector manually and search it with a binary_search. In any case, I advise using a tree to store all maps.
It depends on how your maps are "independently created", but if it's an option, I'd make just one global map (or multimap) object and pass that to all your creators. If you have lots of small maps all over the place, you can just call insert on the global one to merge your maps into it.
That way you have only a single object in which to perform lookup, which is reasonably efficient (O(log n) for multimap, expected O(1) for unordered_multimap).
This also saves you from having to pass raw pointers to containers around and having to clean up!
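Merging into one container can be sketched as below. The helper name `merge_maps` is an assumption for the example, and it uses a plain `std::map` as the combined container, which means duplicate keys collapse to a single entry; with a `std::multimap` every entry would be kept.

```cpp
#include <map>
#include <vector>

// Copies every entry of each source map into one combined map.
// std::map::insert keeps the existing entry when a key is already present,
// so on duplicate keys the earliest map in the list wins.
template <typename K, typename V>
std::map<K, V> merge_maps(const std::vector<const std::map<K, V>*>& sources) {
    std::map<K, V> combined;
    for (const std::map<K, V>* m : sources) {
        combined.insert(m->begin(), m->end());
    }
    return combined;
}
```

After the merge, a single `combined.find(key)` replaces the loop over all maps.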

Multiple keys Hash Table (unordered_map)

I need to use multiple keys (of type int) to store and retrieve a single value from a hash table. I would use multiple keys to index a single item. I need fast insertion and lookup for the hash table. By the way, I am not allowed to use the Boost library in the implementation.
How could I do that?
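If you mean that two ints form a single key, then use unordered_map<std::pair<int,int>, value_type>. Note that the standard library provides no std::hash specialization for std::pair, so you have to supply your own hash functor as the third template argument. If you want to index the same set of data by multiple keys, then look at Boost.MultiIndex.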
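A minimal sketch of the pair-keyed table, using only the standard library. `PairHash` and `PairKeyedMap` are hypothetical names, `std::string` stands in for the unspecified value type, and the mixing formula is the widely used "hash combine" recipe, one reasonable choice among many.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical hash functor for std::pair<int, int>.
struct PairHash {
    std::size_t operator()(const std::pair<int, int>& p) const {
        std::size_t h1 = std::hash<int>{}(p.first);
        std::size_t h2 = std::hash<int>{}(p.second);
        // Mix the two hashes so that (a, b) and (b, a) usually differ.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// The value type std::string is only a placeholder for the example.
using PairKeyedMap =
    std::unordered_map<std::pair<int, int>, std::string, PairHash>;
```

Equality of keys uses the default `operator==` of `std::pair`, so no custom comparison is needed.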
If the key to your container is comprised of the combination of multiple ints, you could use boost::tuple as your key, to encapsulate the ints without more work on your part. This holds provided your count of key int subcomponents is fixed.
Easiest way is probably to keep a map of pointers/indexes to the elements in a list.
A few more details are needed here, though: do you need to support deletion? How are the elements set up? Can you use boost::shared_ptr? (It is rather helpful if you need to support deletion.)
I'm assuming that the value object in this case is large, or there is some other reason you can't simply duplicate values in a regular map.
If retrieval always uses the full combination of keys, it is better to form a single compound key from the individual keys.
You can do this in either of two ways:
1) Store the key as a concatenated string of the ints, like
(int1,int2,int3) => data
2) Use a wider integer type such as uint64_t, into which you pack the individual values to form a key
// Refer comment below for the approach
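The packed-integer variant might look like the following sketch. It assumes exactly two keys that each fit in 32 bits, and `pack_key` is a made-up name; three full-width ints would not fit into one uint64_t.

```cpp
#include <cstdint>
#include <unordered_map>

// Packs two 32-bit keys into one 64-bit key: high half = a, low half = b.
// The casts through uint32_t keep negative values from sign-extending.
inline std::uint64_t pack_key(std::int32_t a, std::int32_t b) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(a)) << 32) |
           static_cast<std::uint64_t>(static_cast<std::uint32_t>(b));
}

// Example table type keyed by the packed value; int is a placeholder value type.
using PackedTable = std::unordered_map<std::uint64_t, int>;
```

Because std::hash is already defined for 64-bit integers, this approach needs no custom hash functor, and the packing is reversible with a shift and a mask.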

set map implementation in C++

I find that both set and map are implemented as a tree. Is set a plain binary search tree and map a self-balancing binary search tree, such as a red-black tree? I am confused about the difference between the implementations. The differences I can think of are as follows:
1) An element in a set has only one value (the key); an element in a map has two values.
2) A set stores and fetches elements by the elements themselves; a map stores and fetches elements via a key.
What else is important?
Maps and sets have almost identical behavior and it's common for the implementation to use the exact same underlying technique.
The only important difference is map doesn't use the whole value_type to compare, just the key part of it.
Usually you'll know right away which you need: if you just have a bool for the "value" argument to the map, you probably want a set instead.
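The comparison difference is easy to observe. In this small sketch (the function names are invented for the demonstration), the same two entries are inserted into a map and into a set of pairs:

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// A map compares only the key part, so the second insert is rejected.
inline std::size_t kept_in_map() {
    std::map<int, std::string> m;
    m.insert({1, "a"});
    m.insert({1, "b"});  // key 1 already present: no effect
    return m.size();
}

// A set compares the whole element, so both pairs are kept.
inline std::size_t kept_in_set_of_pairs() {
    std::set<std::pair<int, std::string>> s;
    s.insert({1, "a"});
    s.insert({1, "b"});  // (1, "b") differs from (1, "a"): inserted
    return s.size();
}
```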
Set is a discrete mathematics concept that, in my experience, pops up again and again in programming. The STL set class is a relatively efficient way to keep track of sets where the most common operations are insert/remove/find.
Maps are used where objects have a unique identity that is small compared to their entire set of attributes. For example, a web page can be defined as a URL and a byte stream of contents. You could put that byte stream in a set, but the binary search process would be extremely slow (since the contents are much bigger than the URL) and you wouldn't be able to look up a web page if its contents change. The URL is the identity of the web page, so it is the key of the map.
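A map is conceptually like a set< std::pair<> > that orders and compares its elements by the key part only; in practice both containers are usually built on the same balanced-tree code.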
The set is used when you want an ordered list to quickly search for an item, basically, while a map is used when you want to retrieve a value given its key.
In both cases, the key (for map) or value (for set) must be unique. If you want to store multiple values that are the same, you would use multimap or multiset.
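A short sketch of the unique versus multi behavior, with helper names chosen only for illustration:

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Number of distinct values: std::set silently drops duplicates.
inline std::size_t distinct_count(const std::vector<int>& xs) {
    return std::set<int>(xs.begin(), xs.end()).size();
}

// Number of occurrences of v: std::multiset keeps every duplicate.
inline std::size_t occurrences(const std::vector<int>& xs, int v) {
    return std::multiset<int>(xs.begin(), xs.end()).count(v);
}
```

The same relationship holds between std::map and std::multimap, with uniqueness applying to the key.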