What is an idiomatic way in C++ to have an object store that can be searched with respect to two keys? Essentially what I would like is to store things of type A in a binary search tree (BST) with the BST constructed using the order relation on A.key. However, each A also has a unique A.otherval and I essentially need to delete keys based on this value.
In C I would typically just have a BST with parent pointers and a hash table, keyed on the other values, that stores pointers to the nodes of the BST. I can then delete keys through the hash table by getting the node and calling tree delete on that node.
I'm looking for how to do this correctly using STL containers.
If I got the question correctly, all you need is a map in a map, so
std::map<first_key_type, std::map<second_key_type, value_type>> map;
map[key1][key2] = something;
Edit:
I assume that all the values that have the same first key are the same, and the second key is only used as an additional search/remove criterion. In that case, to get the value by the first key only you can use something like
map.at(key).cbegin()->second;
I would recommend two maps, one to map the key to the instance of A (the primary map), and another to map the otherVal to the key:
typedef ... Key;
typedef ... OtherVal;
struct A { Key key; OtherVal otherVal; ... };
typedef std::map<Key,A> KeyToAMap;
typedef std::map<OtherVal,Key> OtherValToKeyMap;
KeyToAMap keyToAMap;
OtherValToKeyMap otherValToKeyMap;
This way you can work with keyToAMap without any additional complexity, but when it comes time to delete, you just need an additional lookup.
To ease the usage, I would also recommend writing functions for wrapping insertion and deletion in both maps:
void insertNewA(const A& a) {
keyToAMap.insert(std::make_pair(a.key, a ));
otherValToKeyMap.insert(std::make_pair(a.otherVal, a.key ));
}
void deleteByOtherVal(const OtherVal& otherVal) {
    OtherValToKeyMap::iterator it1 = otherValToKeyMap.find(otherVal);
    if (it1 == otherValToKeyMap.end()) { /* error */ return; }
    const Key& key = it1->second;
    KeyToAMap::iterator it2 = keyToAMap.find(key);
    if (it2 == keyToAMap.end()) { /* error */ return; }
    keyToAMap.erase(it2);
    otherValToKeyMap.erase(it1); // erase last: key refers into this entry
}
An advantage of this solution is it only requires two maps, as opposed to a multi-level map solution, which requires 1+N maps, where N is the number of entries in the primary map.
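For comparison, the arrangement described in the question (a hash table holding pointers to tree nodes) can be approximated by storing std::map iterators in a std::unordered_map, since a std::map iterator stays valid until its own element is erased. The following is only a sketch under assumptions: Key is taken to be int and OtherVal to be std::string, and the class name DualKeyStore and its member names are hypothetical.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical element type; Key = int and OtherVal = std::string are assumptions.
struct A {
    int key;
    std::string otherVal;
};

class DualKeyStore {
public:
    // Returns false (and changes nothing) if either key is already present.
    bool insert(const A& a) {
        if (byOtherVal_.count(a.otherVal) != 0) return false;
        auto result = byKey_.insert(std::make_pair(a.key, a));
        if (!result.second) return false;
        byOtherVal_[a.otherVal] = result.first;  // remember the tree node
        return true;
    }

    // Erases the element with the given otherVal; returns false if absent.
    bool eraseByOtherVal(const std::string& otherVal) {
        auto it = byOtherVal_.find(otherVal);
        if (it == byOtherVal_.end()) return false;
        byKey_.erase(it->second);  // erase by iterator: no second tree search
        byOtherVal_.erase(it);
        return true;
    }

    // Read-only view of the elements, ordered by A::key.
    const std::map<int, A>& ordered() const { return byKey_; }

private:
    std::map<int, A> byKey_;
    std::unordered_map<std::string, std::map<int, A>::iterator> byOtherVal_;
};
```

Compared with the two-map version above, deletion here needs one hash lookup and no search in the tree, at the cost of keeping iterators that must be removed together with the elements they point to.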
I have a Java program that I want to convert to C++. The Java code uses a LinkedHashMap data structure. Is there an equivalent datatype for LinkedHashMap in C++?
I tried to use std::unordered_map, however, it does not maintain the order of the insertion.
C++ does not offer a collection template with the behavior that would mimic Java's LinkedHashMap<K,V>, so you would need to maintain the order separately from the mapping.
This can be achieved by keeping the data in a std::list<std::pair<K,V>>, and keeping a separate std::unordered_map<K, std::list<std::pair<K,V>>::iterator> map for quick look-up of the item by key:
On adding an item, add the corresponding key/value pair to the end of the list, and map the key to the iterator std::prev(list.end()).
On removing an item by key, look up its iterator, remove it from the list, and then remove the mapping.
On replacing an item, look up list iterator from the unordered map first, and then replace its content with a new key-value pair.
On iterating the values, simply iterate std::list<std::pair<K,V>>.
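The four operations above can be sketched as follows. This is a minimal illustration, not a full LinkedHashMap replacement: the class name LinkedMap and its member names are hypothetical, and the key and value types are fixed to std::string and int to keep it short.

```cpp
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical insertion-ordered map built from a list plus a hash index.
class LinkedMap {
public:
    typedef std::list<std::pair<std::string, int> > Items;

    // Inserts a new key at the end, or replaces the value of an existing key
    // without changing its position.
    void put(const std::string& key, int value) {
        auto found = index_.find(key);
        if (found != index_.end()) {
            found->second->second = value;
            return;
        }
        items_.push_back(std::make_pair(key, value));
        index_[key] = std::prev(items_.end());
    }

    // Removes the key if present; returns whether anything was removed.
    bool remove(const std::string& key) {
        auto found = index_.find(key);
        if (found == index_.end()) return false;
        items_.erase(found->second);  // list erase by iterator is O(1)
        index_.erase(found);
        return true;
    }

    // Iterate this list to visit the entries in insertion order.
    const Items& items() const { return items_; }

private:
    Items items_;
    std::unordered_map<std::string, Items::iterator> index_;
};
```

std::list is used because its iterators remain valid when other elements are inserted or erased, which is what lets the hash index store them safely.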
The insertion order contract on key iteration can be achieved with a balanced tree for O(log n) performance. This is better than maintaining keys in a list, as item removal then requires O(n) lookup time. My mantra is never put something you look up in a list. If it doesn't have to be sorted, use a hash. If it should be sorted, use a balanced tree. If all you're going to do is iterate, then a list is fine.
In C++ this would be std::map, where the key is the item reference and the value is the insertion order; std::map keeps its keys sorted and is typically implemented as a red-black tree. See: Is there a sorted container in STL
This is how I do it:
map<TKey, set<MyClass<K1,K2>, greater<MyClass<K1, K2>>>> _objects; // set ordered by timestamp. Does not guarantee uniqueness based on K1 and K2.
map<TKey, map<K2, typename set<MyClass<K1, K2>, greater<MyClass<K1, K2>>>::iterator>> _objectsMap; // Used to locate object in _objects
To add object id:
if (_objectsMap[userId].find(id) == _objectsMap[userId].end())
_objectsMap[userId][id] = _objects[userId].emplace(userId, id).first;
To erase an object id:
if (_objectsMap[userId].find(id) != _objectsMap[userId].end()) {
_objects[userId].erase(_objectsMap[userId][id]);
_objectsMap[userId].erase(id);
}
To retrieve, say the most recent size objects from the list starting from a specific object id:
vector<K2> result;
if (_objectsMap[userId].find(id) != _objectsMap[userId].end() && _objectsMap[userId][id] != _objects[userId].begin()) {
typename set<MyClass<K1, K2>, greater<MyClass<K1, K2>>>::iterator start = _objects[userId].begin(), end = _objectsMap[userId][id];
size_t counts = distance(_objects[userId].begin(), _objectsMap[userId][id]);
if (counts > size)
advance(start, counts - size);
transform(start,
end,
back_inserter(result),
[](const MyClass<K1, K2>& obj) { return obj.ID(); });
}
return result;
So I'm fairly new to C++ and I'm a little confused about how to use the find() function on my set, which stores pair items. I've read about how to insert and remove items by their pair, but came up dry on anyone explaining how to use find (or some other method, if there is one) to find a value by the first item of the pair.
set<pair<string, CustomObject>> *items = new set<pair<string, CustomObject>>();
Then let's say I insert a few pairs into the set, and then I want to find one of those pairs by searching for the "key" stored as the first item in the pair. I think it would involve calling .first on the pair, but I'm having trouble with that. This is the basic function I'm trying to implement:
bool inSet(string key){
return this->items->find(pair<string, CustomObject>(key, null).first)
}
I was able to implement everything just fine with a map, but then I had to switch to a set because I wanted to be able to sort the items in the data structure and I was told that you can't do this efficiently in a map, hence the set.
std::set stores and searches for values based on the entire value. So when you do a find for pair(key, null), and the set contains pair(key, somevalue), it won't find it, as they are not the same.
If you want to search by just the key, you need a std::map. As you say, that doesn't do any searching or sorting by the value, so you can only have one entry with a given key.
If you want to search/sort by both just the key and the key,value pair (different searches at different points in the lifetime of the same data structure), then you'll need a more complex arrangement.
A map of sets can do what you want:
std::map<string, std::set<CustomObject>> items;
Now when you just want to look up things by key, you just lookup in the map, getting back a set of all the values with that key. If you want to search further for a specific value, you look it up in that set.
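The two kinds of lookup can be sketched like this. The CustomObject below is a hypothetical stand-in (the question does not show the real one), and it is given an operator< only because std::set needs an ordering; the helper names hasKey and hasPair are likewise made up for illustration.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the question's CustomObject.
struct CustomObject {
    int id;
    bool operator<(const CustomObject& other) const { return id < other.id; }
};

typedef std::map<std::string, std::set<CustomObject> > Items;

// True if any value is stored under the key.
bool hasKey(const Items& items, const std::string& key) {
    return items.find(key) != items.end();
}

// True if this exact (key, value) combination is stored.
bool hasPair(const Items& items, const std::string& key, const CustomObject& value) {
    Items::const_iterator it = items.find(key);
    return it != items.end() && it->second.count(value) != 0;
}
```

Both helpers use find rather than operator[], so a failed lookup does not insert an empty set under the key.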
To find an element in a std::set by the key stored in the pair, you need to redefine the ordering comparison for your set (if you need multiple objects per key, use multiset):
typedef pair<string, CustomObject> SetValue;
struct CustomObjectCompare {
    // order (and therefore search) by the string only
    bool operator() (const SetValue& lhs, const SetValue& rhs) const {
        return lhs.first < rhs.first;
    }
};
// use multiset instead of set if you need multiple objects per key
typedef set<SetValue, CustomObjectCompare> Set;
Set mySet;
bool inSet(const string& key) {
    static CustomObject emptyObject;
    return mySet.end() != mySet.find(SetValue(key, emptyObject));
}
This example defines a comparison object, CustomObjectCompare, and a set type, Set, that uses it. Both searching and sorting then consider only the string. The function inSet searches by string; emptyObject is ignored by the comparison and may be any existing object. In the example it is a function-local static object that is initialized once.
I've tried to implement a hash table using a vector. The table size is defined in the constructor; for example, let's say the table size is 31. To create the hash table I do the following:
vector<string> entries; // it is filled with the entries that I'll put into the hash table
vector<string> hashtable;
hashtable.resize(31);
for(int i=0;i<entries.size();i++){
int index=hashFunction(entries[i]);
// now I need to know whether I've already put an entry into hashtable[index] or not
}
Can anyone help me with how to do that?
Each cell in your hashtable comes with a bit of extra packaging.
If your hash allows deletions you need a state such that a cell can be marked as "deleted". This enables your search to continue looking even if it encounters this cell which has no actual value in it.
So a cell can have 3 states, occupied, empty and deleted.
You might also wish to store the hash-value in the cell. This is useful when you come to resize the table as you don't need to rehash all the entries.
In addition, it can serve as a cheap first comparison, because comparing two numbers is likely to be quicker than comparing two objects.
These are considerations if this is an exercise, or if you find that std::unordered_map / std::unordered_set is not adequate for your purpose or if those are not available to you.
For practical purposes, at least try using those first.
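As an exercise-level illustration of the three cell states and the cached hash value, here is a small fixed-size table with linear probing. It is a sketch under assumptions: the class name ProbingTable and all member names are hypothetical, std::hash stands in for the question's hashFunction, and there is no resizing.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical fixed-size open-addressing table with linear probing.
class ProbingTable {
public:
    explicit ProbingTable(std::size_t size) : cells_(size) {}

    bool contains(const std::string& key) const {
        return findCell(key) != cells_.size();
    }

    // Returns false if the key is already present or no cell is free.
    bool insert(const std::string& key) {
        if (contains(key)) return false;
        const std::size_t h = std::hash<std::string>()(key);
        for (std::size_t n = 0; n < cells_.size(); ++n) {
            Cell& c = cells_[(h % cells_.size() + n) % cells_.size()];
            if (c.state != Occupied) {  // empty and deleted cells can be reused
                c.state = Occupied;
                c.hash = h;
                c.key = key;
                return true;
            }
        }
        return false;
    }

    bool erase(const std::string& key) {
        const std::size_t i = findCell(key);
        if (i == cells_.size()) return false;
        cells_[i].state = Deleted;  // tombstone: later searches probe past it
        return true;
    }

private:
    enum State { Empty, Occupied, Deleted };
    struct Cell {
        State state;
        std::size_t hash;  // cached hash value of the stored key
        std::string key;
        Cell() : state(Empty), hash(0) {}
    };

    // Index of the cell holding key, or cells_.size() if it is not stored.
    std::size_t findCell(const std::string& key) const {
        const std::size_t h = std::hash<std::string>()(key);
        for (std::size_t n = 0; n < cells_.size(); ++n) {
            const std::size_t i = (h % cells_.size() + n) % cells_.size();
            const Cell& c = cells_[i];
            if (c.state == Empty) break;  // a never-used cell ends the search
            if (c.state == Occupied && c.hash == h && c.key == key) return i;
        }
        return cells_.size();
    }

    std::vector<Cell> cells_;
};
```

The search compares the cached hash before the key itself, and it stops only at an Empty cell, never at a Deleted one; that is the reason the third state is needed.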
It is possible to have several items for the same hash value.
You just need to define your hash-table like this:
vector<vector<string>> hashtable;
hashtable.resize(32); //0-31
for(int i=0;i<entries.size();i++){
int index=hashFunction(entries[i]);
hashtable[index].push_back(entries[i]);
}
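With this layout, checking whether an entry has already been stored comes down to searching one bucket. A minimal sketch follows; the helper names are hypothetical, std::hash stands in for the question's hashFunction (which is not shown), and the table is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

typedef std::vector<std::vector<std::string> > ChainedTable;

// Stand-in for the question's hashFunction; assumes table.size() > 0.
std::size_t bucketOf(const ChainedTable& table, const std::string& entry) {
    return std::hash<std::string>()(entry) % table.size();
}

// True if the entry is already stored in its bucket.
bool contains(const ChainedTable& table, const std::string& entry) {
    const std::vector<std::string>& bucket = table[bucketOf(table, entry)];
    return std::find(bucket.begin(), bucket.end(), entry) != bucket.end();
}

// Adds the entry unless it is already stored; returns whether it was added.
bool insertUnique(ChainedTable& table, const std::string& entry) {
    if (contains(table, entry)) return false;
    table[bucketOf(table, entry)].push_back(entry);
    return true;
}
```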
A simple implementation of a hash table uses a vector of pointers to the actual entries:
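class hash_map {
public:
    iterator find(const key_type& key);
    //...
private:
    struct Entry { // representation
        key_type key;
        mapped_type val;
        bool erased; // set when the entry has been logically removed
        Entry* next; // hash overflow link
        Entry(key_type k, mapped_type v, Entry* n)
            : key(k), val(v), erased(false), next(n) {}
    };
    vector<Entry> v; // the actual entries
    vector<Entry*> b; // the hash table, pointers into v
};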
To find a value, operator[] uses a hash function to find an index in the hash table for the key:
mapped_type& hash_map::operator[](const key_type& k) {
    size_type i = hash(k)%b.size(); // hash
    for (Entry* p=b[i];p;p=p->next) // search among entries hashed to i
        if (eq(k,p->key)) { // found
            if (p->erased) { // re-insert
                p->erased=false;
                no_of_erased--;
                return p->val=default_value;
            }
            return p->val;
        }
    // not found, resize if needed
    // (max_load and resize() are assumed members, elided from the class above)
    if (b.size()*max_load < v.size()) {
        resize(b.size()*2); // grow the table
        return operator[](k); // retry against the resized table
    }
    v.push_back(Entry(k,default_value,b[i])); // add Entry
    b[i]=&v.back(); // point to new element
    return b[i]->val;
}
In a C++ std::map, is there any way to search for the key given the mapped value? Example:
I have this map:
map<int,string> myMap;
myMap[0] = "foo";
Is there any way that I can find the corresponding int, given the value "foo"?
cout << myMap.some_function("foo") <<endl;
Output: 0
std::map doesn't provide a (fast) way to find the key of a given value.
What you want is often called a "bijective map", or "bimap" for short. Boost has such a data structure. This is typically implemented by using two index trees "glued" together (where std::map has only one for the keys). Boost also provides the more general multi index with similar use cases.
If you don't want to use Boost, if storage is not a big problem, and if you can afford the extra code effort, you can simply use two maps and glue them together manually:
std::map<int, string> myMapForward;
std::map<string, int> myMapBackward; // maybe even std::set
// insertion becomes:
myMapForward.insert(std::make_pair(0, "foo"));
myMapBackward.insert(std::make_pair("foo", 0));
// forward lookup becomes:
myMapForward[0];
// backward lookup becomes:
myMapBackward["foo"];
Of course you can wrap those two maps in a class and provide some useful interface, but this might be a bit overkill, and using two maps with the same content is not an optimal solution anyway. As commented below, exception safety is also a problem with this solution. But in many applications it's already enough to simply add another reverse map.
Please note that since std::map stores unique keys, this approach will support backward lookup only for unique values, as collisions in the value space of the forward map correspond to collisions in the key space of the backward map.
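If wrapping the two maps in a class is worth it for your case, a minimal sketch might look like the following. The class name IntStringBimap and its member functions are hypothetical, and the sketch does not attempt to address the exception-safety concern mentioned above.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical wrapper that keeps a forward and a backward map in step.
class IntStringBimap {
public:
    // Inserts the pair only if neither the key nor the value is already used.
    bool insert(int key, const std::string& value) {
        if (forward_.count(key) != 0 || backward_.count(value) != 0) return false;
        forward_.insert(std::make_pair(key, value));
        backward_.insert(std::make_pair(value, key));
        return true;
    }

    // Both lookups return a pointer to the stored item, or nullptr if absent.
    const std::string* findValue(int key) const {
        std::map<int, std::string>::const_iterator it = forward_.find(key);
        return it == forward_.end() ? nullptr : &it->second;
    }
    const int* findKey(const std::string& value) const {
        std::map<std::string, int>::const_iterator it = backward_.find(value);
        return it == backward_.end() ? nullptr : &it->second;
    }

private:
    std::map<int, std::string> forward_;
    std::map<std::string, int> backward_;
};
```

The lookups use find instead of operator[], because operator[] on a std::map inserts a default-constructed element when the key is missing, which would silently put the two maps out of sync.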
No, not directly.
One option is to examine each value in the map until you find what you are looking for. This, obviously, will be O(n).
In order to do this you could just write a for() loop, or you could use std::find_if(). In order to use find_if(), you'll need to create a predicate. In C++11, this might be a lambda:
typedef std::map <unsigned, Student> MyMap;
MyMap myMap;
// ...
const std::string targetName = "Jones";
find_if (myMap.begin(), myMap.end(), [&targetName] (const MyMap::value_type& test)
{
    return test.second.mName == targetName;
});
If you're using C++03, then this could be a functor:
struct MatchName
: public std::unary_function <MyMap::value_type, bool>
{
MatchName (const std::string& target) : mTarget (target) {}
bool operator() (const MyMap::value_type& test) const
{
if (test.second.mName == mTarget)
return true;
return false;
}
private:
const std::string mTarget;
};
// ...
find_if (myMap.begin(), myMap.end(), MatchName (targetName));
Another option is to build an index. The index would likely be another map, where the key is whatever values you want to find and the value is some kind of index back to the main map.
Suppose your main map contains Student objects which consist of a name and some other stuff, and the key in this map is the Student ID, an integer. If you want to find the student with a particular last name, you could build an indexing map where the key is a last name (probably want to use multimap here), and the value is the student ID. You can then index back in to the main map to get the remainder of the Student's attributes.
There are challenges with the second approach. You must keep the main map and the index (or indices) synchronized when you add or remove elements. You must make sure that what you store as the value in the index is not something that may change, like a pointer. If you are multithreading, then you have to give some thought to how both the map and the index will be protected without introducing deadlocks or race conditions.
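One way to keep the main map and the index synchronized is to route every insertion and removal through a single class. The sketch below assumes a Student with only an mName member; the class name StudentDirectory and its functions are hypothetical, and no thread safety is attempted.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record; only mName appears in the text above.
struct Student {
    std::string mName;
};

class StudentDirectory {
public:
    // Adds a student under a new ID; returns false if the ID is taken.
    bool add(unsigned id, const Student& s) {
        if (!students_.insert(std::make_pair(id, s)).second) return false;
        byName_.insert(std::make_pair(s.mName, id));
        return true;
    }

    // Removes the student and the matching index entry.
    bool remove(unsigned id) {
        std::map<unsigned, Student>::iterator it = students_.find(id);
        if (it == students_.end()) return false;
        typedef std::multimap<std::string, unsigned>::iterator IndexIt;
        std::pair<IndexIt, IndexIt> range = byName_.equal_range(it->second.mName);
        for (IndexIt i = range.first; i != range.second; ++i) {
            if (i->second == id) { byName_.erase(i); break; }
        }
        students_.erase(it);
        return true;
    }

    // IDs of all students with the given name.
    std::vector<unsigned> idsByName(const std::string& name) const {
        std::vector<unsigned> ids;
        typedef std::multimap<std::string, unsigned>::const_iterator IndexIt;
        std::pair<IndexIt, IndexIt> range = byName_.equal_range(name);
        for (IndexIt i = range.first; i != range.second; ++i) ids.push_back(i->second);
        return ids;
    }

private:
    std::map<unsigned, Student> students_;         // main map: ID -> record
    std::multimap<std::string, unsigned> byName_;  // index: name -> ID
};
```

The index stores the student ID rather than a pointer or iterator, so it stays meaningful however the main map is reorganized.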
The only way to accomplish this that I can think of is to iterate through it. This is most likely not what you want, but it's the best shot I can think of. Good luck!
No, you cannot do this directly. You have to iterate over the map, compare each value with the one you are looking for, and return the corresponding key, which costs O(n) time.
You can achieve this by iterating which will take O(n) time. Or you can store the reverse map which will take O(n) space.
By iterating:
std::map<int, string> fmap;
for (std::map<int,string>::iterator it=fmap.begin(); it!=fmap.end(); ++it)
    if (it->second == "foo") // std::string compares directly; strcmp would need c_str() and returns 0 on a match
        break;
By storing reverse map:
std::map<int, string> fmap;
std::map<string, int> bmap;
fmap.insert(std::make_pair(0, "foo"));
bmap.insert(std::make_pair("foo", 0));
fmap[0]; // original map lookup
bmap["foo"]; //reverse map lookup
I would like to access/iterate over all non-unique keys in an unordered_multimap.
The hash table is basically a map from a signature <SIG>, which does indeed occur more than once in practice, to identifiers <ID>. I would like to find those entries in the hash table where <SIG> occurs more than once.
Currently I use this approach:
// map <SIG> -> <ID>
typedef unordered_multimap<int, int> HashTable;
HashTable& ht = ...;
for(HashTable::iterator it = ht.begin(); it != ht.end(); ++it)
{
size_t n=0;
std::pair<HashTable::iterator, HashTable::iterator> itpair = ht.equal_range(it->first);
for ( ; itpair.first != itpair.second; ++itpair.first) {
++n;
}
if( n > 1 ){ // access those items again as the previous iterators are not valid anymore
std::pair<HashTable::iterator, HashTable::iterator> itpair = ht.equal_range(it->first);
for ( ; itpair.first != itpair.second; ++itpair.first) {
// do something with those items
}
}
}
This is certainly not efficient as the outer loop iterates over all elements of the hash table (via ht.begin()) and the inner loop tests if the corresponding key is present more than once.
Is there a more efficient or elegant way to do this?
Note: I know that with an unordered_map instead of unordered_multimap I wouldn't have this issue, but due to application requirements I must be able to store multiple keys <SIG> pointing to different identifiers <ID>. Also, an unordered_map<SIG, vector<ID> > is not a good choice for me, as it uses roughly 150% of the memory: I have many unique keys, and vector<ID> adds quite a bit of overhead for each item.
Use std::unordered_multimap::count() to determine the number of elements with a specific key. This saves you the first inner loop.
You cannot prevent iterating over the whole HashTable. For that, the HashTable would have to maintain a second index that maps cardinality to keys. This would introduce significant runtime and storage overhead and is only useful in a small number of cases.
You can hide the outer loop using std::for_each(), but I don't think it's worth it.
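Although the whole table still has to be visited, each key only needs to be examined once: in an unordered_multimap, elements with equivalent keys are adjacent in iteration order, so the loop can jump from one group of equal keys to the next. A sketch, with the function name idsOfRepeatedKeys being hypothetical:

```cpp
#include <cstddef>
#include <iterator>
#include <unordered_map>
#include <utility>
#include <vector>

typedef std::unordered_multimap<int, int> HashTable;

// Collects the IDs stored under every key that occurs more than once.
std::vector<int> idsOfRepeatedKeys(const HashTable& ht) {
    std::vector<int> ids;
    HashTable::const_iterator it = ht.begin();
    while (it != ht.end()) {
        std::pair<HashTable::const_iterator, HashTable::const_iterator> range =
            ht.equal_range(it->first);
        if (std::distance(range.first, range.second) > 1) {
            for (HashTable::const_iterator i = range.first; i != range.second; ++i)
                ids.push_back(i->second);
        }
        it = range.second;  // jump past the whole group of equal keys
    }
    return ids;
}
```

This calls equal_range once per distinct key instead of once per element, and it never revisits a group.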
I think that you should change your data model to something like:
std::map<int, std::vector<int> > ht;
Then you could easily iterate over map, and check how many items each element contains with size()
In this situation, though, building the data structure and reading it linearly is a little more complicated.