symmetrical associative array search by key and value - c++

I need to describe an associative array in which to search, you can use the key and value. With functions add, delete, getBy1st (search by key), getBy2nd (search by value).
For example in C++:
symmap<std::string, int> m;
m.insert(make_pair<std::string,int> ("hello", 1));
m.insert(make_pair<std::string,int> ("wow", 2));
...
m.getBy1st("hello"); // returns 1
m.getBy2nd(2);// returns "wow"
It should work for O(log(n)) and store in std::pair .
I can not decide what the data structure used to store.
Maybe i can use some variation of rb-tree to store it?

This sounds a lot like Boost.Bimap.

Why not use a pair of hashtables to store the data - one hashing from T1 to T2 and the other hashing in the other direction?

Related

Ordered and unordered map

I am trying to understand ordered and unordered map
I understand that ordered map will sort your list and unordered map will not sort it. base from this site
But why is this code get sorted anyway
std::unordered_map<std::string, int> map;
map.insert(std::make_pair("hello", 1));
map.insert(std::make_pair("world", 2));
map.insert(std::make_pair("call", 3));
map.insert(std::make_pair("ZZ", 4));
for(auto it = map.begin(); it != map.end();it++)
{
std::cout << it->first << std::endl;
}
result:
world
hello
call
ZZ
I dont get it. its supposed to be hello then world then call then ZZ. How did the world came first if its unordered container is not suppose to sort it.
std::hash<std::string> produces hashes for your input keys. Then, a transformation is applied to these hashes to get a valid index to a bucket in your map (The most basic one one can think of is a simple modulo). These operations take constant time and give you the location of your items. In order to get the good item from a transformed hash, you need it to be at the right place. That is why you observe an apparently random ordering of your inputs.
Now there is NO assumption you can make on the order of the computed indices. Suppose you would add some more strings in your map. Maybe a rehash would be needed for some strings and your input strings could appear in yet another order.
This has nothing to do with the order of insertion which has no reason to be kept, the purpose of the data structure is to place the items at the appropriate location so that their retreival can be made in amortized constant time (here by placing them at the indices given by the transformed hashes).
Probably, the best answer to this question is to say: "study a little about data structures.
Unordered map is supposed to store the data into a special data structure specially designed to be efficient but without any restrictions on storage order, and is therefore "unorderer".
Note that if you get the order of insertion is a type of restriction about the order.
Check out this data structure: hash table.
What they mean by "unordered", is that the elements are returned in a random order (based on element hash).

hash table with 2 items as a key?

There is a great example of a simple hash table in C++ for a single key, but I would like to hash on a <int, double> combination so that, for example h[5, 0.1] will return a double. Is this possible?
One possible method to get around this is to create an array of unordered_maps, and then have the key be a double. So, for example, I could simply call h[5][0.1] and get the double value back. In this the best way to go about this or can I create a multi-variabled key?
Sure. Make your key std::pair<int, double> (or tuple of <int, double>). Define appropriate hashing function (I would say hash(int) ^ hash(double) might work)

C++: insert into std::map without knowing a key

I need to insert values into std::map (or it's equivalent) to any free position and then get it's key (to remove/modify later). Something like:
std::map<int, std::string> myMap;
const int key = myMap.insert("hello");
Is it possibly to do so with std::map or is there some appropriate container for that?
Thank you.
In addition to using a set, you can keep a list of allocated (or free)
keys, and find a new key before inserting. For a map indexed by
int, you can simply take the last element, and increment its key. But
I rather think I'd go with a simple std::vector; if deletion isn't
supported, you can do something simple like:
int key = myVector.size();
myVector.push_back( newEntry );
If you need to support deletions, then using a vector of some sort of
"maybe" type (boost::optional, etc.—you probably already have
one in your toolbox, maybe under the name of Fallible or Maybe) might be
appropriate. Depending on use patterns (number of deletions compared to
total entries, etc.), you may want to search the vector in order to
reuse entries. If your really ambitious, you could keep a bitmap of the
free entries, setting a bit each time you delete and entry, and
resetting it whenever you reuse the space.
You can add object to an std::set, and then later put the whole set into a map. But no, you can't put a value into a map without a key.
The closest thing to what you're trying to do is probably
myMap[myMap.size()] = "some string";
The only advantage this has over std::set is that you can pass the integer indexes around to other modules without them needing to know the type of std::set<Foo>::iterator or similar.
It is impossible. Such an operation would require intricate knowledge of the key type to know which keys are available. For example, std::map would have to increment int values for int maps or append to strings for string maps.
You could use a std::set and drop keying altogether.
If you want to achieve something similar to automatically generated primary keys in SQL databases than you can maintain a counter and use it to generate a unique key. But perhaps std::set is what you really need.

C++ making Hash of Arrays

I have a problem with the creation of a hash of arrays. I need a Single Key - Multi Data system:
multimap <Type, vector<type> > var;
But how I can add elements to the vector?
Example: key = 3;
Now I need to append some elements into the vector whose key is 3.
Creating a temp-vector not an answer because I don't know when I need to input element into the vector with the current key.
sorry, understand my problem. i need fast-access struct, that will be operate with ~50,000 words with length ~20 each.
and i need something like tree.
also, have question:
how quick STL-structures, like vector,map,multimap and other?
What's wrong with std::map <KeyType, std::vector<SomeType> >, or some other collection as the value type? This gives you control over how to operate on the value collection. A multimap to me seems like a low-level form of std::map <KeyType, std::list<SomeType> >, but with none of the flexibility of a list.
To find the answer to your question you can look at the slides under point 6. at this site https://ece.uwaterloo.ca/~ece250/Lectures/Slides/
Hope that helps!

How to associate to a number another number without using array

Let's say we have read these values:
3
1241
124515
5322353
341
43262267234
1241
1241
3213131
And I have an array like this (with the elements above):
a[0]=1241
a[1]=124515
a[2]=43262267234
a[3]=3
...
The thing is that the elements' order in the array is not constant (I have to change it somewhere else in my program).
How can I know on which position does one element appear in the read document.
Note that I can not do:
vector <int> a[1000000000000];
a[number].push_back(all_positions);
Because a will be too large (there's a memory restriction). (let's say I have only 3000 elements, but they're values are from 0 to 2^32)
So, in the example above, I would want to know all the positions 1241 is appearing on without iterating again through all the read elements.
In other words, how can I associate to the number "1241" the positions "1,6,7" so I can simply access them in O(1) (where 1 actually is the number of positions the element appears)
If there's no O(1) I want to know what's the optimal one ...
I don't know if I've made myself clear. If not, just say it and I'll update my question :)
You need to use some sort of dynamic array, like a vector (std::vector) or other similar containers (std::list, maybe, it depends on your needs).
Such data structures are safer and easier to use than C-style array, since they take care of memory management.
If you also need to look for an element in O(1) you should consider using some structures that will associate both an index to an item and an item to an index. I don't think STL provides any, but boost should have something like that.
If O(log n) is a cost you can afford, also consider std::map
You can use what is commonly refered to as a multimap. That is, it stores Key and multiple values. This is an O(log) look up time.
If you're working with Visual Studios they provide their own hash_multimap, else may I suggest using Boost::unordered_map with a list as your value?
You don't need a sparse array of 1000000000000 elements; use an std::map to map positions to values.
If you want bi-directional lookup (that is, you sometimes want "what are the indexes for this value?" and sometimes "what value is at this index?") then you can use a boost::bimap.
Things get further complicated as you have values appearing more than once. You can sacrifice the bi-directional lookup and use a std::multimap.
You could use a map for that. Like:
std::map<int, std::vector<int>> MyMap;
So everytime you encounter a value while reading the file, you append it's position to the map. Say X is the value you read and Y is the position then you just do
MyMap[X].push_back( Y );
Instead of you array use
std::map<int, vector<int> > a;
You need an associative collection but you might want to associated with multiple values.
You can use std::multimap< int, int >
or
you can use std::map< int, std::set< int > >
I have found in practice the latter is easier for removing items if you just need to remove one element. It is unique on key-value combinations but not on key or value alone.
If you need higher performance then you may wish to use a hash_map instead of map. For the inner collection though you will not get much performance in using a hash, as you will have very few duplicates and it is better to std::set.
There are many implementations of hash_map, and it is in the new standard. If you don't have the new standard, go for boost.
It seems you need a std::map<int,int>. You can store the mapping such as 1241->0 124515->1 etc. Then perform a look up on this map to get the array index.
Besides the std::map solution offered by others here (O(log n)), there's the approach of a hash map (implemented as boost::unordered_map or std::unordered_map in C++0x, supported by modern compilers).
It would give you O(1) lookup on average, which often is faster than a tree-based std::map. Try for yourself.
You can use a std::multimap to store both a key (e.g. 1241) and multiple values (e.g. 1, 6 and 7).
An insert has logarithmic complexity, but you can speed it up if you give the insert method a hint where it can insert the item.
For O(1) lookup you could hash the number to find its entry (key) in a hash map (boost::unordered_map, dictionary, stdex::hash_map etc)
The value could be a vector of indices where the number occurs or a 3000 bit array (375 bytes) where the bit number for each respective index where the number (key) occurs is set.
boost::unordered_map<unsigned long, std::vector<unsigned long>> myMap;
for(unsigned long i = 0; i < sizeof(a)/sizeof(*a); ++i)
{
myMap[a[i]].push_back(i);
}
Instead of storing an array of integer, you could store an array of structure containing the integer value and all its positions in an array or vector.