Map supporting multiple keys for each value in C++

I'm looking for a ready-made associative container that supports multiple keys (aliases) mapping to a single value.
If there's no ready-made solution, am I likely to have to resort to using two separate maps, or is there a better way?
It seems that std::multimap is the reverse of what I want.
A very similar question has an accepted answer recommending boost::multi_index, but I've had a look at the documentation and am completely baffled about how to use it.
If multi_index can help me implement this, does anyone have an example?
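One way to get aliases without Boost is a single hash map whose entries hold shared pointers, so that several keys refer to the same value object. The following is only a minimal sketch under that assumption; AliasMap, insert, add_alias and find are hypothetical names chosen for illustration, not part of any library.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: several keys (aliases) share one value object.
template <typename Key, typename Value>
class AliasMap {
public:
    // Stores a new value under its first key. Returns false if the key is taken.
    bool insert(const Key& key, Value value) {
        return slots_.emplace(key, std::make_shared<Value>(std::move(value))).second;
    }

    // Makes `alias` refer to the value already stored under `existing`.
    // Returns false if `existing` is unknown or `alias` is already in use.
    bool add_alias(const Key& existing, const Key& alias) {
        auto it = slots_.find(existing);
        if (it == slots_.end()) return false;
        std::shared_ptr<Value> shared = it->second;  // copy before modifying the map
        return slots_.emplace(alias, std::move(shared)).second;
    }

    // Returns a pointer to the shared value, or nullptr if the key is unknown.
    Value* find(const Key& key) {
        auto it = slots_.find(key);
        return it == slots_.end() ? nullptr : it->second.get();
    }

private:
    std::unordered_map<Key, std::shared_ptr<Value>> slots_;
};
```

A change made through one alias is visible through every other alias, because all of them point at the same object. The value is destroyed once its last key is removed.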

Related

Boost flat_map container

Working on some legacy code, I am running into memory issues due mainly (I believe) to the extensive use of STL maps (particularly “maps-of-maps”.)
I am looking at Boost flat_map as a possible solution. Does anyone have any firsthand experience with flat_maps, in particular with regard to improvements in speed and/or memory usage? I realize, of course, that this can be very dependent on the types of data stored and the manner in which they are stored, but I am still curious about people's actual experience.
Can anyone point me to some solid examples?
As an example: there are several cases in this code of a map-of-a-map; that is, a map where the value is another map.
By replacing the “inner” map with a pair of vectors, I reduced the memory footprint 10:1 (3G to 300M). Of course this can slow down searches but for this particular case it doesn’t seem to matter much. And it involved about a day of refactoring and careful testing.
Boost’s flat_map sounds like it might be just what I need but I can’t seem to find out much about it other than the class description on the Boost web site. Looking for some firsthand feedback.
Boost's flat_map is an ordered map with much the same interface as a tree-based map, except that its elements are stored in a (sorted) vector of key-value pairs and are found by binary search.
You can basically figure out the answers regarding performance (relative to std::map) by yourself based on that fact:
Iterating the map or a large part of it should be super-fast, relatively
Lookup should typically be relatively fast
Adding or removing values is theoretically much slower, but in practice - assuming your key and value types are small and the number of map elements not very high - probably comparable in speed (or even better on small maps - often no allocation is necessary on insert)
etc.
In your case - maps-of-maps - you're going to lose some of the benefit of "flattening things out", since you'll have an outer map with a pointer to an inner map (if not more levels of indirection); but the flat map would at least help you reduce that. Also, supposing you have two levels of maps, you could arrange it so that you store all of the inner maps contiguously (either by constructing the inner maps appropriately or by instantiating them with your own allocator, a trickier affair); in that case, you could replace pointers to maps with map indices, reducing the amount of space they take up and making life easier for the compiler.
You might also want to read Boost's documentation of flat_map; and you could also just use the force and read the source (and the source of the underlying flat_tree), like I have; I don't actually have flat_map experience myself.
I know this is an old question, but this might be of use to someone finding this question.
I found that flat_map was a big improvement in searching, lookup and iterating large maps. The fact the map is using contiguous data in memory also makes inserting faster than you might expect due to great data locality. If you're doing more inserts than lookups in your map then it might not be for you.
Having said that, repeatedly inserting a random value into a sorted vector is faster than the same on a linked list because of the data locality - despite what Big O might tell you. (tested in VS2017 and G++ 4.8).
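The sorted-vector insertion mentioned above can be sketched with std::lower_bound followed by vector::insert. This is only an illustration; insert_sorted is a hypothetical helper name, and it makes no claim about the timings reported above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserts `value` into an already sorted vector, keeping it sorted.
// The search is O(log n); the insert shifts the tail, which is O(n)
// but touches contiguous memory.
inline void insert_sorted(std::vector<int>& sorted, int value) {
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), value);
    sorted.insert(pos, value);
}
```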

How to create a named_graph with Boost Graph Library?

I am currently working with the Boost Graph Library. I need unique edges and vertices. Unfortunately, the Boost graphs don't provide this feature, so I have to check manually every time before inserting an edge or a vertex.
Now I've found this: http://www.boost.org/doc/libs/1_49_0/boost/graph/named_graph.hpp
I am wondering whether this would help me. Because the documentation says nothing about named_graph, I don't know how to use it. Maybe there is someone around who could give me a small example or explanation? This would help me a lot.
Thanks in advance.
The Boost Graph Library is very flexible and allows you to choose the internal representation for your vertices and edges. If you choose a container such as std::set then you could enforce unique vertices and edges directly. Details are here: Using Adjacency List
The named_graph type allows you to index your vertices by a property you can choose yourself (for example a "string" representing a name). It effectively wraps a standard adjacency_list in a map whose key is the named property and whose values are the nodes. There is a good example of how to use it in the boost source named_vertices_test.cpp.
Not sure what you're trying to do, but you could use a std::map/std::set yourself to map from some unique property to nodes in an adjacency_list. If you just need to ensure the graph has unique nodes/edges when you make it, then this approach is straightforward and simple, and is usually the best way.
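The idea of mapping a unique property to nodes yourself can be sketched as follows. To stay self-contained, this sketch uses plain standard containers as a stand-in for an adjacency_list rather than the Boost Graph Library itself; UniqueGraph and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Stand-in for a graph: plain std containers, not a BGL adjacency_list.
class UniqueGraph {
public:
    using Vertex = std::size_t;

    // Returns the existing vertex for `name`, or creates a new one.
    Vertex add_vertex(const std::string& name) {
        auto result = index_.emplace(name, index_.size());
        return result.first->second;
    }

    // Adds a directed edge once; returns false if it was already present.
    bool add_edge(const std::string& from, const std::string& to) {
        Vertex u = add_vertex(from);
        Vertex v = add_vertex(to);
        return edges_.emplace(u, v).second;
    }

    std::size_t vertex_count() const { return index_.size(); }
    std::size_t edge_count() const { return edges_.size(); }

private:
    std::map<std::string, Vertex> index_;        // unique name -> vertex id
    std::set<std::pair<Vertex, Vertex>> edges_;  // each (from, to) stored once
};
```

With a real adjacency_list, the map's value type would presumably be the graph's vertex descriptor instead of a plain index.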
You should think about the consequences of changing the backing containers to std::set; for example, the performance of many algorithms will change. There is no simple answer to which is the best container to use.

Fastest way to speed up map<string,int>::find() in C++ when the keys are in alphabetical order

I have a map with about 100,000 pairs. Is there any way that I can speed up searching when using find(), given that the keys are in alphabetical order? Also, how should I go about doing it? I know that you can specify a new comparator when you create the map, but will that speed up the find() function at all?
Thanks in advance.
[solved] Thanks a bunch, guys. I have decided to go with a vector and use lower_bound and upper_bound to "snip" some of the searching.
Also, I am new here. Is there any way to mark this question as answered, or pick a best answer?
A different comparator will only speed up find if it manages to do the comparison faster (which, for strings will usually be pretty difficult).
If you're basically inserting all the data in order, then doing the searching, it may be faster to use a std::vector with std::lower_bound or std::upper_bound.
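As a rough sketch of the sorted-vector approach, assuming the pairs are already sorted by key, a lookup with std::lower_bound might look like this. Entry and find_sorted are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, int>;

// Looks up `key` in a vector that is sorted by key.
// Returns a pointer to the mapped value, or nullptr if the key is absent.
inline const int* find_sorted(const std::vector<Entry>& entries,
                              const std::string& key) {
    auto it = std::lower_bound(
        entries.begin(), entries.end(), key,
        [](const Entry& e, const std::string& k) { return e.first < k; });
    if (it != entries.end() && it->first == key) return &it->second;
    return nullptr;
}
```

The comparison count is the same O(log n) as for std::map, but the elements sit in one contiguous block instead of separately allocated nodes.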
If you don't really care about ordering, and just want to find the data as quickly as possible, you might find that std::unordered_map works better for you.
Edit: Just for the record: the way you "might find" or "may find" those things is normally by profiling. Depending on the situation, the difference might be large enough to be obvious even in simple testing, so profiling isn't really necessary; but if there's (much) doubt, or you want to quantify the effect, a profiler is probably the right way to do it.
std::map is already taking advantage of the fact the keys are in alphabetical order - it guarantees that itself. You aren't going to be able to improve it by changing the comparator (one assumes it's already a reasonably efficient string comparison).
Have you considered using unordered_map (a.k.a. hash_map in various implementations pre-C++11)? It should be able to search in O(1) on average, instead of O(log(n)) for std::map.
You could also look into something slightly more exotic, like a trie, but that's not part of the standard library so you'd either have to find one elsewhere or roll your own, so I'd suggest unordered_map is a good place to start.
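If you want to try unordered_map against an existing std::map, one possible sketch is to copy the data across and compare the two with a profiler. to_hash_map is a hypothetical helper name; the container calls are standard C++11.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>

// Copies an ordered map into a hash map for average O(1) lookups.
inline std::unordered_map<std::string, int>
to_hash_map(const std::map<std::string, int>& ordered) {
    std::unordered_map<std::string, int> hashed;
    hashed.reserve(ordered.size());  // size the table once, before filling it
    hashed.insert(ordered.begin(), ordered.end());
    return hashed;
}
```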
If you're using std::find to find elements, you should switch to using map::find (you don't really say in your question.) map::find uses the fact that the map is ordered to search much faster.
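To make the difference concrete, here is a small sketch contrasting the two kinds of lookup. Table, contains_fast and contains_slow are hypothetical names; the generic version uses std::find_if so that it compares keys only.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Table = std::map<std::string, int>;

// Member lookup: descends the tree, O(log n) comparisons.
inline bool contains_fast(const Table& table, const std::string& key) {
    return table.find(key) != table.end();
}

// Generic algorithm: visits the elements one by one, O(n) comparisons.
inline bool contains_slow(const Table& table, const std::string& key) {
    return std::find_if(table.begin(), table.end(),
                        [&key](const Table::value_type& kv) {
                            return kv.first == key;
                        }) != table.end();
}
```

Both functions return the same answer; only the amount of work differs.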
If that's still not good enough, you might look into a hash container such as unordered_map rather than map.
I've put in a vote for unordered_map but I wanted to also make another point.
One of the things that can hurt performance on modern machines is poor use of the cache. A map is going to have nodes allocated all over the place and there won't be much locality of reference. Also since it has to store a bunch of pointers between nodes it will use up more memory.
At the recent Going Native 2012 conference Bjarne Stroustrup gave an interesting talk that touched on this topic. He compared vector and list performance at a task involving a lot of random insertions and deletions, where it might seem list ought to have dominated, but because of the memory size and layout issue vector was in fact the fastest by far. Take a look at his slides, starting at slide 43.
unordered_map gives you direct access to the element, so it probably means even less hopping around in memory than trying to stick your data in a vector (and thus better performance than vector). My comment is simply an admonishment to always keep your memory access pattern in mind for performance.

Efficient Dictionary lookup

For my C++ application, there is a requirement to check whether a word is a valid English dictionary word or not. What is the best way to do it? Is there a freely available dictionary that I can make use of? I just need a collection of all possible words. How do I make this lookup as inexpensive as possible? Do I need to hash it?
Use either a std::set<std::string> or a std::unordered_set<std::string>. The latter is new in C++0x and may or may not be supported by your C++ Standard Library implementation; if it does not support it, it may include a hash_set of some kind: consult your documentation to find out.
Which of these (set, which uses a binary search tree, and unordered_set, which uses a hashtable) is more efficient depends on the number of elements you are storing in the container and how your Standard Library implementation implements them. Your best bet is to try both and see which performs better for your specific scenario.
Alternatively, if the list of words is fixed, you might consider using a sorted std::vector and using std::binary_search to find words in it.
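The fixed word list variant could be sketched like this: sort once up front, then answer each query with std::binary_search. WordList is a hypothetical class name used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Immutable word list: sorted once, then queried by binary search.
class WordList {
public:
    explicit WordList(std::vector<std::string> words) : words_(std::move(words)) {
        std::sort(words_.begin(), words_.end());
        // Drop duplicates so each word is stored once.
        words_.erase(std::unique(words_.begin(), words_.end()), words_.end());
    }

    // O(log n) string comparisons per query.
    bool contains(const std::string& word) const {
        return std::binary_search(words_.begin(), words_.end(), word);
    }

private:
    std::vector<std::string> words_;
};
```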
With regards to the presence of a word list, it depends on the platform.
Under Linux, /usr/share/dict/words contains a list of English words
that might meet your needs. Otherwise, there are doubtlessly such lists
available on the network.
Given the size of such lists, the most rapid access will be to load it
into a hash table: std::unordered_set, if you have it; otherwise, many
C++ compilers come with a hash_set, although different compilers have
a slightly different interface for it, and put it in different
namespaces. If that still has performance problems, it's possible to do
better if you know the number of entries in advance (so the table never
has to grow), and implement the hash table in an std::vector (or even a
C style array); handling collisions will be a bit more complicated,
however.
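A hash table kept in a std::vector whose size is fixed in advance might be sketched as below. The collision strategy (open addressing with linear probing), the choice of roughly twice the expected word count as capacity, and the use of an empty string as the free-slot marker are all assumptions of this sketch; FixedWordSet is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Fixed-capacity hash set using open addressing with linear probing.
// The capacity is chosen once from the expected word count and never grows.
class FixedWordSet {
public:
    explicit FixedWordSet(std::size_t expected_words)
        : slots_(expected_words * 2 + 1) {}  // about half full at the expected count

    // Returns false for an empty word, a duplicate, or a table that is too full.
    bool insert(const std::string& word) {
        if (word.empty() || size_ + 1 >= slots_.size()) return false;
        std::size_t i = probe(word);
        if (!slots_[i].empty()) return false;  // already stored
        slots_[i] = word;
        ++size_;
        return true;
    }

    bool contains(const std::string& word) const {
        return !word.empty() && !slots_[probe(word)].empty();
    }

private:
    // Finds the slot holding `word`, or the first free slot at or after
    // its hash position. At least one slot is always kept free, so the
    // loop terminates.
    std::size_t probe(const std::string& word) const {
        std::size_t i = std::hash<std::string>{}(word) % slots_.size();
        while (!slots_[i].empty() && slots_[i] != word) {
            i = (i + 1) % slots_.size();
        }
        return i;
    }

    std::vector<std::string> slots_;  // an empty string marks a free slot
    std::size_t size_ = 0;
};
```

Whether this beats std::unordered_set depends on the data and the implementation, so it is worth measuring both.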
Another possibility would be a trie. This will almost certainly result
in the least number of basic operations in the lookup, and is fairly
simple to implement. Typical implementations will have very poor
locality, however, which could make it slower than some of the other
solutions in actual practice (or not—the only way to know is to
implement both and measure).
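A very small trie could look like the following sketch. It assumes words consist only of the lowercase letters 'a' to 'z' in an ASCII-compatible encoding; Trie, insert and contains are hypothetical names. Each node is allocated separately, which illustrates the poor locality mentioned above.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <string>

// Minimal trie over the letters 'a'..'z' (an assumption of this sketch).
class Trie {
public:
    // Returns false, storing nothing, if the word has a character outside 'a'..'z'.
    bool insert(const std::string& word) {
        for (char c : word) {
            if (c < 'a' || c > 'z') return false;
        }
        Node* node = &root_;
        for (char c : word) {
            std::unique_ptr<Node>& child = node->children[c - 'a'];
            if (!child) child.reset(new Node());
            node = child.get();
        }
        node->is_word = true;
        return true;
    }

    // One pointer hop per character of the query.
    bool contains(const std::string& word) const {
        const Node* node = &root_;
        for (char c : word) {
            if (c < 'a' || c > 'z') return false;
            node = node->children[c - 'a'].get();
            if (!node) return false;
        }
        return node->is_word;
    }

private:
    struct Node {
        std::array<std::unique_ptr<Node>, 26> children;
        bool is_word = false;
    };
    Node root_;
};
```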
I actually did this a few months ago, or something close to this. You can probably find one online for free.
Like on this website: http://wordlist.sourceforge.net/
Just put it in a text file, and compare words with what is on the list. The lookup should be O(n), with n being the number of words in the list. Do you need a faster time complexity?
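If the linear scan turns out to be too slow, one option is to read the text file once into a hash set and query that instead. Note that this swaps the line-by-line comparison described above for a std::unordered_set. load_words is a hypothetical helper, and it assumes whitespace-separated words; it takes any std::istream, so it works with a std::ifstream opened on the word list.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>

// Reads whitespace-separated words from `in` into a hash set.
inline std::unordered_set<std::string> load_words(std::istream& in) {
    std::unordered_set<std::string> words;
    std::string word;
    while (in >> word) {
        words.insert(word);
    }
    return words;
}
```

After loading, words.count(candidate) != 0 checks a word in average constant time.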
Hope this helps.

Is there already some std::vector based set/map implementation?

For small sets or maps, it's usually much faster to just use a sorted vector, instead of the tree-based set/map - especially for something like 5-10 elements. LLVM has some classes in that spirit, but no real adapter that would provide a std::map like interface backed up with a std::vector.
Any (free) implementation of this out there?
Edit: Thanks for all the alternative ideas, but I'm really interested in a vector based set/map. I do have specific cases where I tend to create huge amounts of sets/maps which contain usually less than 10 elements, and I do really want to have less memory pressure. Think about for example neighbor edges for a vertex in a triangle mesh, you easily wind up with 100k sets of 3-4 elements each.
I just stumbled upon your question; I hope it's not too late.
I recommend a great (open source) library named Loki.
It has a vector based implementation of an associative container that is a drop-in replacement for std::map, called AssocVector.
It offers better performance for accessing elements (and worse performance for insertions/deletions).
The library was written by Andrei Alexandrescu, author of Modern C++ Design.
It also contains some other really nifty stuff.
If you can't find anything suitable, I would just wrap a std::vector to do sort() on insert, and implement find() using lower_bound(). It should be straightforward, and just as efficient as a custom solution.
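A minimal sketch of such a wrapper is shown below. Instead of calling sort() after every insert, it uses lower_bound() to find the insertion point, which also keeps the vector sorted. VectorMap, set, find and size are hypothetical names, and only a small part of the std::map interface is covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Small map backed by a vector that is kept sorted by key.
template <typename Key, typename Value>
class VectorMap {
public:
    using Entry = std::pair<Key, Value>;

    // Inserts a new entry or overwrites the value of an existing key.
    void set(const Key& key, const Value& value) {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        if (it != entries_.end() && !(key < it->first)) {
            it->second = value;                       // key already present
        } else {
            entries_.insert(it, Entry(key, value));   // keeps the order intact
        }
    }

    // Returns a pointer to the value, or nullptr if the key is absent.
    const Value* find(const Key& key) const {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        if (it != entries_.end() && !(key < it->first)) return &it->second;
        return nullptr;
    }

    std::size_t size() const { return entries_.size(); }

private:
    static bool key_less(const Entry& entry, const Key& key) {
        return entry.first < key;
    }

    std::vector<Entry> entries_;
};
```

For sets of three or four elements, all entries live in one small allocation, which is where the memory savings over a node-based map come from.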
Old post, I know, but for more recent visitors, Boost's flat_set and flat_map look like what you need. See https://theboostcpplibraries.com/boost.container for more information.
I don't know any such implementation, but there are some functions that help working with sorted vectors already in STL, such as lower_bound and upper_bound.
If the set or map truly is small, the performance gained by micro-optimizing the data structure will have little to no noticeable effects. You'll save maybe one or two memory (read: cache) lookups when searching a tiny tree vs tiny vector, which in the big picture is insignificant.
Having said that, you could give hash_map a try. Lookups by key run in constant time on average.
Maybe you're looking for unordered maps and unordered sets. Try taking a look at the TR1 unordered containers that rely on hashing, or the Boost.Unordered container library. Underneath the interface, I'm not sure if they really do use std::vector, but I'd wager it's worth taking a look.