I can get all elements in a single bucket with this code:
typedef boost::unordered_multimap< key, myClass*, MyHash<key> >
HashMMap;
HashMMap::iterator it;
it = hashMMap_.find( someKey);
int bucketIndex = hashMMap_.bucket( someKey);
int bucketSize = hashMMap_.bucket_size( bucketIndex);
qDebug() << "index of bucket with key:" << someKey << " is:"
<< bucketIndex;
qDebug() << "number of elements in bucket with index:" << bucketIndex << " is:"
<< bucketSize;
HashMMap::local_iterator lit;
/* begin of bucket with index bucketIndex */
lit = hashMMap_.begin( bucketIndex);
for ( ; lit != hashMMap_.end( bucketIndex); ++lit) {
qDebug() << "(*lit).first:" << (*lit).first << ", (*lit).second:" <<
(*lit).second << ", (*lit).second->something_:" <<
(*lit).second->something_;
}
I would like to get a local_iterator to the first element in a bucket and iterate until the bucket's end. If there is only one value for a given index in the hash table (where the index is Hash(key)), I would iterate over a single element and then reach the bucket's end(); if there are many elements, I would iterate over the whole bucket (all values with an equal hash). Is this possible without bucketIndex, hashMMap_.begin( bucketIndex) and hashMMap_.end( bucketIndex)?
So basically I would like to get a local_iterator like this:
HashMMap::local_iterator lit = hashMMap_.find_bucket_if_present( someKey);
An additional question: do I have to check that find() returns an iterator to an element before calling int bucketIndex = hashMMap_.bucket( someKey)? I think so, because the Boost documentation describes bucket() as follows:
Returns: The index of the bucket which would contain an element with
key k.
^^^
I think this means I first have to call find(key) on the multimap to know whether the key is present, because bucket(key) does not return the hash itself but the hash modulo bucket_count (bucket_from_hash), i.e. the index under which the key is stored if it is present. Because of that modulo, if the key was never inserted I will iterate over the bucket it would currently land in. Most importantly for me, elements with different hashes could also be in that bucket, since bucket_count might be smaller than my hash range (I pass a 16-bit MyHash<key> of a 32-bit key to the multimap constructor as the hash function). Is this correct?
I would start working with ranges, like so:
#include <boost/range/iterator_range.hpp>

// Non-const overload: yields a range of local_iterator over the bucket for k.
template<typename BoostUnorderedMap, typename Key>
boost::iterator_range< typename BoostUnorderedMap::local_iterator > get_bucket_range( BoostUnorderedMap& myMap, Key const& k ) {
    typename BoostUnorderedMap::size_type bucketIndex = myMap.bucket( k );
    return boost::iterator_range< typename BoostUnorderedMap::local_iterator >(
        myMap.begin(bucketIndex),
        myMap.end(bucketIndex)
    );
}
// Const overload: the nested type is const_local_iterator.
template<typename BoostUnorderedMap, typename Key>
boost::iterator_range< typename BoostUnorderedMap::const_local_iterator > get_bucket_range( BoostUnorderedMap const& myMap, Key const& k ) {
    typename BoostUnorderedMap::size_type bucketIndex = myMap.bucket( k );
    return boost::iterator_range< typename BoostUnorderedMap::const_local_iterator >(
        myMap.begin(bucketIndex),
        myMap.end(bucketIndex)
    );
}
Then, at least in C++11, you can do the following:
for (auto && entry : get_bucket_range( some_map, "bob" ) )
and it iterates over everything in the "bob" bucket.
While this does use bucketIndex, it hides these details from the end consumer and simply gives you a boost::iterator_range instead.
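On the second question: bucket(k) does not require the key to be present, but a bucket can also hold elements with other keys (and other hashes), so a bucket walk usually needs a key comparison. Here is a minimal sketch of that, assuming std::unordered_multimap in place of boost::unordered_multimap (the bucket interface is the same) and a deliberately weak, hypothetical hash named CollidingHash to force collisions. values_for_key_in_bucket is an illustrative name, not a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical hash that maps many keys to the same value, to force collisions.
struct CollidingHash {
    std::size_t operator()(int key) const {
        return static_cast<std::size_t>(key) % 4;
    }
};

typedef std::unordered_multimap<int, int, CollidingHash> MMap;

// Collects the values stored under `key` by walking only the bucket that
// bucket(key) names. The equality test is needed because the bucket may also
// hold elements whose keys merely collide with `key`.
std::vector<int> values_for_key_in_bucket(const MMap& m, int key) {
    std::vector<int> out;
    if (m.bucket_count() == 0) {
        return out;  // nothing to walk in a container without buckets
    }
    const std::size_t b = m.bucket(key);
    for (MMap::const_local_iterator it = m.cbegin(b); it != m.cend(b); ++it) {
        if (it->first == key) {
            out.push_back(it->second);
        }
    }
    return out;
}
```

If you only need the values for one key, equal_range(key) gives you the same elements without touching buckets at all.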
Related
Given a map<string, int> band in which each key (of type string) holds a number value, like this
+++++++++++++++
key | value
+++++++++++++++
red | 0
blue | 1
orange| 3
etc...
What is the optimal way to return the value of an index using the key?
I already tried using find like this
band1 = band.find("a");
where a is the key value in the map, but it does not seem to be working.
find returns an iterator pointing to the found key-value pair (if any). You have to dereference that iterator to get the actual mapped value:
int band1;
auto it = band.find("a");
if (it != band.end())
band1 = it->second;
else
/* not found ... */;
Note that *it just gives us the std::pair containing the key and mapped value together. To access the mapped value itself we use it->second.
Alternatively, if you know that the key is in the map, you can use at to get the mapped value for that key:
int band1 = band.at("a");
at will throw an out_of_range exception if the element is not found.
Finally, if you want to access the value with key "a" and you want to automatically add that key to the map if it is not already there, you can use the subscript operator []:
int band1 = band["a"]; //warning: inserts {a, 0} into the map if not found!
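If you want the find-based lookup without repeating the end() check everywhere, you could wrap it in a small helper. This is only a sketch: value_or is a hypothetical name and the fallback value is an assumption about what "not found" should mean in your program.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns the mapped value for `key`, or `fallback` if
// the key is absent. Unlike operator[], it never inserts into the map.
int value_or(const std::map<std::string, int>& m,
             const std::string& key,
             int fallback) {
    std::map<std::string, int>::const_iterator it = m.find(key);
    return it != m.end() ? it->second : fallback;
}
```

Because the map is taken by const reference, the compiler also stops you from calling operator[] on it by accident.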
Write a function that takes a std::map and a std::vector of keys as arguments and returns the corresponding values in a std::vector:
// Takes both containers by const reference to avoid copying them.
vector<int> valueReturn(const map<string,int>& data, const vector<string>& key) {
    vector<int> value;
    for(const auto& it: key) {
        auto search = data.find(it);
        if(search != data.end()) {
            value.push_back(search->second); // reuse the iterator instead of a second lookup
            std::cout << "Found " << search->first << " " << search->second << '\n';
        }
        else {
            value.push_back(-1); // -1 marks a missing key; pick any sentinel that is never a real value
            std::cout << "Not found\n";
        }
    }
    return value;
}
int band1 = band["a"];
int band2 = band["b"];
int band3 = band["c"];
int band4 = band["d"];
I have a map input which contains a list of words and their counts.
I use this function to print the map input:
template <class KTy, class Ty>
void PrintMap(map<KTy, Ty> map)
{
typedef typename std::map<KTy, Ty>::iterator iterator;
for (iterator p = map.begin(); p != map.end(); p++)
cout << p->first << ": " << p->second << endl;
}
It prints the values like this:
you : 296
she : 14
go : 29
How can I print it in descending order of word count?
Try the following:
// Copy it into a vector.
std::vector<std::pair<std::string,int>> vector( map.begin(), map.end() );
// Sort the vector according to the word count in descending order.
std::sort( vector.begin(), vector.end(),
[]( const auto & lhs, const auto & rhs )
{ return lhs.second > rhs.second; } );
// Print out the vector.
for ( const auto & item : vector )
std::cout << item.first << ": " << item.second << std::endl;
The map is sorted according to the keys, i. e. alphabetically. You cannot change that behaviour. Therefore, the easiest way to get the job done is copying it into a vector and sorting it with a user defined compare function.
map stores elements sorted by key not by value. Unless you want to change the type of your map, the only way to do it is to sort your elements after getting them out. I think the easiest way to do so is via std::sort
Given that dereferencing an iterator into a map gives a const value_type&, we can take advantage of the reference to avoid actually creating copies of value_type (which is a std::pair<Key, Value>). However, the code will be a little longer than if we wanted to copy:
template <class KTy, class Ty>
void PrintMap(const std::map<KTy, Ty>& map)
{
using vt = const typename std::map<KTy, Ty>::value_type*;
std::vector<vt> vec(map.size());
size_t i = 0;
for(const auto& keyval : map)
{
vec[i++] = &keyval;
}
std::sort(std::begin(vec), std::end(vec), [](vt _lhs, vt _rhs){return _lhs->second > _rhs->second;});
for(const auto& el : vec)
std::cout << el->first << ": " << el->second << std::endl;
}
With a test
std::map<std::string, int> myMap{{"you", 296}, {"she", 14}, {"go", 29}};
PrintMap(myMap);
Outputs
you: 296
go: 29
she: 14
This should be much faster if the map's elements are expensive to copy.
Put data in another container, sort by word count, print.
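One way to read that suggestion, as a sketch only: use a std::multimap keyed by the count with std::greater as the comparator, so the "other container" does the sorting for you. The function name by_count_descending is made up for illustration, and this assumes C++11.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns (word, count) pairs ordered by count, largest first.
std::vector<std::pair<std::string, int>> by_count_descending(
        const std::map<std::string, int>& words) {
    // Keyed by count; std::greater makes iteration run from high to low.
    std::multimap<int, std::string, std::greater<int>> by_count;
    for (const auto& kv : words) {
        by_count.emplace(kv.second, kv.first);
    }
    std::vector<std::pair<std::string, int>> out;
    out.reserve(by_count.size());
    for (const auto& kv : by_count) {
        out.emplace_back(kv.second, kv.first);
    }
    return out;
}
```

Printing is then a plain loop over the returned vector. A multimap rather than a map is needed here because several words can share the same count.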
My goal is to map elements of a type to other elements of the same type. Suppose they are size_t for simplicity.
std::map<size_t, size_t> myMapping;
This would do it, but if I want to follow a bunch of such links (they are all the same map), each step is a log(n) lookup.
size_t k = /*whatever*/;
myMapping[myMapping[myMapping[k]]]; //3 * log(n)
I want to make use of the fact that map iterators remain valid and have a map that maps size_t to iterators into itself.
typedef /*myMapTemplate*/::iterator map_iter;
std::map<size_t, map_iter> myMapping;
size_t k = /*whatever*/
map_iter entryPoint = myMapping.find(k);
entryPoint->second->second->first; //log(n) + 2 constant time operations
How would I write this type?
I know copying would keep iterators to old map and plan to take care of this myself.
As I understand your question, you want a map of the form key->map<key,>::iterator.
So here it is: a struct with a map iterator as its value:
template <
template <class K, class V, class C, class A> class mapImpl,
class K,
class V,
class C=std::less<K>,
class A=std::allocator<std::pair<const K, V> >
>
class value_with_iterator {
public:
typedef typename mapImpl<const K,value_with_iterator,C,A>::iterator value_type;
value_type value;
};
The map defined using the struct above:
typedef std::map<size_t, value_with_iterator <std::map, size_t, size_t> > map_size_t_to_itself;
An insert function that links a key to itself:
map_size_t_to_itself::iterator insert(map_size_t_to_itself& mapRef, size_t value)
{
map_size_t_to_itself::value_type v(value, map_size_t_to_itself::mapped_type());
std::pair<map_size_t_to_itself::iterator, bool> res = mapRef.insert(v);
if (res.second)
res.first->second.value = res.first;
return res.first;
}
And simple test:
int main() {
map_size_t_to_itself mapObj;
map_size_t_to_itself::iterator i1 = insert(mapObj, 1);
map_size_t_to_itself::iterator i2 = insert(mapObj, 1);
map_size_t_to_itself::iterator i3 = insert(mapObj, 2);
std::cout << i1->first << ": " << i1->second.value->first << std::endl;
std::cout << i2->first << ": " << i2->second.value->first << std::endl;
std::cout << i3->first << ": " << i3->second.value->first << std::endl;
}
with OUTPUT:
1: 1
1: 1
2: 2
Full link: http://ideone.com/gnEhw
If I understood your problem correctly, I think I would keep my elements in a vector and use a vector of indices into the first vector for the kind of indirection you want. If you also need ordered access you can always throw in a map to the elements of the first vector.
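A rough sketch of the index-based idea, assuming the elements can be numbered 0..n-1 so that a plain std::vector<std::size_t> can hold "element i maps to element next[i]". The names next and follow are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// next[i] is the index that element i maps to. Following one link is a
// constant-time vector access instead of a log(n) map lookup.
std::size_t follow(const std::vector<std::size_t>& next,
                   std::size_t start,
                   std::size_t steps) {
    std::size_t cur = start;
    for (std::size_t s = 0; s < steps; ++s) {
        cur = next.at(cur);  // at() throws std::out_of_range on a bad index
    }
    return cur;
}
```

If the keys are not dense, a separate std::map from key to index gives you the single log(n) entry lookup, after which every hop is a vector access.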
This is probably really simple, but I can't find a simple example for it.
I understand that with a hash_multimap you can have several values mapped to a single key. But how exactly would I access those values? All the examples I stumbled across only access the first value mapped to the key. Here's an example of what I mean:
key : value
1 : obj1a;
2 : obj2a, obj2b, obj2c
How would I access obj2b and obj2c, not just obj2a?
The usual multimap iteration loop is like this:
#include <unordered_map>
typedef std::unordered_multimap<K, V> mmap_t;
mmap_t m;
for (mmap_t::const_iterator it1 = m.begin(), it2 = it1, end = m.end(); it1 != end; it1 = it2)
{
    // outer loop over unique keys
    // (check it2 against end before dereferencing it)
    for ( ; it2 != end && it1->first == it2->first; ++it2)
    {
        // inner loop, all keys equal to it1->first
    }
}
To iterate over just one key value, use equal_range instead.
std::pair<mmap_t::const_iterator, mmap_t::const_iterator> p = m.equal_range(key);
for (mmap_t::const_iterator it = p.first; it != p.second; ++it)
{
// use "it->second"
}
For example, equal_range returns two iterators, to the beginning and end of the matching range:
void lookup(const map_type& Map, int key)
{
cout << key << ": ";
pair<map_type::const_iterator, map_type::const_iterator> p =
Map.equal_range(key);
for (map_type::const_iterator i = p.first; i != p.second; ++i)
cout << (*i).second << " ";
cout << endl;
}
where we're using a map_type like
class ObjectT; // This is the type of object you want to store
typedef hash_multimap<int, ObjectT> map_type;
Just grab an iterator to the first one and increment it. If the keys are still equal, you've got another entry with the same key value. You can also use equal_range.
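For a self-contained version of the equal_range approach, here is a sketch that assumes std::unordered_multimap<int, std::string> as a stand-in for hash_multimap and strings as a stand-in for the stored objects. values_for is a made-up name. The result is sorted because the order of elements within an equal range is unspecified.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

typedef std::unordered_multimap<int, std::string> mmap_t;

// Returns every value stored under `key`, sorted for a predictable order.
std::vector<std::string> values_for(const mmap_t& m, int key) {
    std::pair<mmap_t::const_iterator, mmap_t::const_iterator> p =
        m.equal_range(key);
    std::vector<std::string> out;
    for (mmap_t::const_iterator it = p.first; it != p.second; ++it) {
        out.push_back(it->second);
    }
    std::sort(out.begin(), out.end());
    return out;
}
```

For a key that is not in the container, both iterators are equal and the function simply returns an empty vector.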
I have a large(ish - >100K) collection mapping a user identifier (an int) to the count of different products that they've bought (also an int.) I need to re-organise the data as efficiently as possible to find how many users have different numbers of products. So for example, how many users have 1 product, how many users have two products etc.
I have achieved this by reversing the original data from a std::map into a std::multimap (where the key and value are simply swapped). I can then pick out the number of users having N products using count(N), although I also stored the unique values in a set so I could be sure of the exact number of values I was iterating over and their order.
Code looks like this:
// uc is a std::map<int, int> containing the original
// mapping of user identifier to the count of different
// products that they've bought.
std::set<int> uniqueCounts;
std::multimap<int, int> cu; // This maps count to user.
for ( map<int, int>::const_iterator it = uc.begin();
it != uc.end(); ++it )
{
cu.insert( std::pair<int, int>( it->second, it->first ) );
uniqueCounts.insert( it->second );
}
// Now write this out
for ( std::set<int>::const_iterator it = uniqueCounts.begin();
it != uniqueCounts.end(); ++it )
{
std::cout << "==> There are "
<< cu.count( *it ) << " users that have bought "
<< *it << " products(s)" << std::endl;
}
I just can't help feeling that this is not the most efficient way of doing this. Anyone know of a clever method of doing this?
I'm limited in that I can't use Boost or C++11 to do this.
Oh, also, in case anyone is wondering, this is neither homework, nor an interview question.
Assuming you know the maximum number of products that a single user could have bought, you might see better performance just using a vector to store the results of the operation. As it is you're going to need an allocation for pretty much every entry in the original map, which likely isn't the fastest option.
It would also cut down on the lookup overhead on a map, gain the benefits of memory locality, and replace the call to count on the multimap (which is not a constant time operation) with a constant time lookup of the vector.
So you could do something like this:
// One slot per possible product count, 0..MAX_PRODUCTS_PER_USER inclusive.
std::vector< int > uniqueCounts( MAX_PRODUCTS_PER_USER + 1 );
for ( std::map<int, int>::const_iterator it = uc.begin();
      it != uc.end(); ++it )
{
    uniqueCounts[ it->second ]++;
}
// Now write this out
for ( std::size_t i = 0; i < uniqueCounts.size(); ++i )
{
    std::cout << "==> There are "
              << uniqueCounts[ i ] << " users that have bought "
              << i << " products(s)" << std::endl;
}
Even if you don't know the maximum number of products, it seems like you could just guess a maximum and adapt this code to increase the size of the vector if required. It's sure to result in fewer allocations than your original example anyway.
All this assumes that you don't actually require the user ids after you've processed this data, of course, and (as pointed out in the comments below) that the number of products bought by each user is a relatively small and contiguous set. Otherwise you might be better off using a map in place of a vector: you'll still avoid calling the multimap::count function, but potentially lose some of the other benefits.
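As a sketch of the "grow the vector if required" variant: no maximum is guessed up front, and the vector is resized whenever a larger count turns up. count_histogram is a made-up name, and skipping negative counts is my assumption about how to treat bad data. No Boost or C++11 is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// hist[n] is the number of users who bought exactly n products.
std::vector<int> count_histogram(const std::map<int, int>& uc) {
    std::vector<int> hist;
    for (std::map<int, int>::const_iterator it = uc.begin();
         it != uc.end(); ++it) {
        if (it->second < 0) {
            continue;  // assumption: a negative count is invalid and ignored
        }
        const std::size_t n = static_cast<std::size_t>(it->second);
        if (n >= hist.size()) {
            hist.resize(n + 1, 0);  // grow so that index n exists
        }
        ++hist[n];
    }
    return hist;
}
```

The occasional resize is amortised by the vector's growth policy, so this should still allocate far less often than one node per user.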
It depends on what you mean by "more efficient". First off, is this really a bottleneck? Sure, 100k entries is a lot, but if you only have to do this every few minutes, it's OK if the algorithm takes a couple of seconds.
The only area for improvement I see is memory usage. If this is a concern, you can skip the generation of the multimap and just keep a counter map around, something like this (beware, my C++ is a little rusty):
std::map<int, int> countFrequency; // count => how many customers with that count
for ( std::map<int, int>::const_iterator it = uc.begin();
it != uc.end(); ++it )
{
// If it->second is not yet in countFrequency,
// the default constructor initializes it to 0.
countFrequency[it->second] += 1;
}
// Now write this out
for ( std::map<int, int>::const_iterator it = countFrequency.begin();
it != countFrequency.end(); ++it )
{
std::cout << "==> There are "
<< it->second << " users that have bought "
<< it->first << " products(s)" << std::endl;
}
If a user is added and buys count items, you can update countFrequency with
countFrequency[count] += 1;
If an existing user goes from oldCount to newCount items, you can update countFrequency with
countFrequency[oldCount] -= 1;
countFrequency[newCount] += 1;
Now, just as an aside, I recommend using an unsigned int for count (unless there's a legitimate reason for negative counts) and typedef'ing a userID type, for added readability.
If you can, I would recommend keeping both pieces of data current all the time. In other words, I would maintain a second map which is mapping number of products bought to number of customers who bought that many products. This map contains the exact answer to your question if you maintain it. Each time a customer buys a product, let n be the number of products this customer has now bought. Subtract one from the value at key n-1. Add one to the value at key n. If the range of keys is small enough this could be an array instead of a map. Do you ever expect a single customer to buy hundreds of products?
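A sketch of that bookkeeping, assuming both structures are std::map<int, int> and that one call means "this user bought one more distinct product". The names record_purchase, products_per_user and users_per_count are made up for illustration.

```cpp
#include <cassert>
#include <map>

typedef std::map<int, int> Map;

// Records one more product for `user` and keeps the second map
// (product count -> number of users with that count) in step.
void record_purchase(Map& products_per_user, Map& users_per_count, int user) {
    int& n = products_per_user[user];  // 0 for a user seen for the first time
    if (n > 0) {
        // The user leaves the group at key n (the old count)...
        Map::iterator old = users_per_count.find(n);
        if (old != users_per_count.end() && --old->second == 0) {
            users_per_count.erase(old);  // drop groups that became empty
        }
    }
    ++n;
    ++users_per_count[n];  // ...and joins the group at the new count
}
```

Erasing empty groups keeps the second map equal to what a full recount would produce, so printing it is a straight loop.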
Just for larks, here's a mixed approach that uses a vector if the data is smallish, and a map to cover the case where one user has bought a truly absurd number of products. I doubt you'll really need the latter in a store app, but a more general version of the problem might benefit from it.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <map>
#include <vector>

typedef std::map<int, int> Map;
typedef Map::const_iterator It;

// contains() is overloaded per container type. The overloads are declared
// before print_counts so that the template can find them.
bool contains(const Map &c, int key) {
    return c.count(key) != 0;
}
bool contains(const std::vector<int> &c, int key) {
    // also check 0 <= key < c.size() for a more general-purpose function
    return c[key] != 0;
}
// As an alternative to this overloaded contains(), you could write
// an overloaded print_counts -- after all the one below is not an
// efficient way to iterate a sparsely-populated map.
// Or you might prefer a template function that visits
// each entry in the container, calling a specified functor that
// prints the output, and passing it the key and value.
// This is just the smallest point of customization I thought of.

template <typename Container>
void get_counts(const Map &source, Container &dest) {
    for (It it = source.begin(); it != source.end(); ++it) {
        ++dest[it->second];
    }
}
template <typename Container>
void print_counts(Container &people, int max_count) {
    for (int i = 0; i <= max_count; ++i) {
        if (contains(people, i)) {
            std::cout << "==> There are "
                      << people[i] << " users that have bought "
                      << i << " products(s)" << std::endl;
        }
    }
}
void do_everything(const Map &uc) {
    // first get the max product count
    int max_count = 0;
    for (It it = uc.begin(); it != uc.end(); ++it) {
        max_count = std::max(max_count, it->second);
    }
    if (static_cast<std::size_t>(max_count) > uc.size()) { // or some other threshold
        Map counts;
        get_counts(uc, counts);
        print_counts(counts, max_count);
    } else {
        std::vector<int> counts(max_count + 1);
        get_counts(uc, counts);
        print_counts(counts, max_count);
    }
}
From here you could refactor, to create a class template CountReOrderer, which takes a template parameter telling it whether to use a vector or a map for the counts.
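A very rough sketch of what such a class template might look like. Only the name CountReOrderer comes from the suggestion above; the constructor, add() and counts() are my guesses at a plausible interface. It assumes the caller pre-sizes a std::vector<int> to max count + 1, whereas a map grows on demand.

```cpp
#include <cassert>
#include <map>
#include <vector>

typedef std::map<int, int> Map;

// Container is the storage for the counts: std::vector<int> (pre-sized by
// the caller) or Map (grows on demand).
template <typename Container>
class CountReOrderer {
public:
    explicit CountReOrderer(const Container &initial) : counts_(initial) {}

    // Tallies how many users have each product count.
    void add(const Map &uc) {
        for (Map::const_iterator it = uc.begin(); it != uc.end(); ++it) {
            ++counts_[it->second];
        }
    }

    const Container &counts() const { return counts_; }

private:
    Container counts_;
};
```

The choice between CountReOrderer<std::vector<int> > and CountReOrderer<Map> can then be made once, at the place where the maximum count is known.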