std::map of iterators to itself - c++

My goal is to map elements of a type to other elements of the same type. Suppose they are size_t for simplicity.
std::map<size_t, size_t> myMapping;
This would do it, but if I want to follow a bunch of such links (they are all the same map), each step is a log(n) lookup.
size_t k = /*whatever*/;
myMapping[myMapping[myMapping[k]]]; //3 * log(n)
I want to make use of the fact that map iterators remain valid and have a map that maps size_t to iterators into itself.
typedef /*myMapTemplate*/::iterator map_iter;
std::map<size_t, map_iter> myMapping;
size_t k = /*whatever*/;
map_iter entryPoint = myMapping.find(k);
entryPoint->second->second->first; //log(n) + 2 constant time operations
How would I write this type?
I know that copying the map would leave the iterators pointing into the old map, and I plan to take care of this myself.

As I understand your question, you want a map from a key to an iterator into that same map type: key -> map<key, ...>::iterator.
So here it is, a struct that holds the map iterator as its value:
template <
template <class K, class V, class C, class A> class mapImpl,
class K,
class V,
class C=std::less<K>,
class A=std::allocator<std::pair<const K, V> >
>
class value_with_iterator {
public:
typedef typename mapImpl<const K,value_with_iterator,C,A>::iterator value_type;
value_type value;
};
The map is then defined using the struct above:
typedef std::map<size_t, value_with_iterator <std::map, size_t, size_t> > map_size_t_to_itself;
An insert helper that links a new key to itself:
map_size_t_to_itself::iterator insert(map_size_t_to_itself& mapRef, size_t value)
{
map_size_t_to_itself::value_type v(value, map_size_t_to_itself::mapped_type());
std::pair<map_size_t_to_itself::iterator, bool> res = mapRef.insert(v);
if (res.second)
res.first->second.value = res.first;
return res.first;
}
And simple test:
int main() {
map_size_t_to_itself mapObj;
map_size_t_to_itself::iterator i1 = insert(mapObj, 1);
map_size_t_to_itself::iterator i2 = insert(mapObj, 1);
map_size_t_to_itself::iterator i3 = insert(mapObj, 2);
std::cout << i1->first << ": " << i1->second.value->first << std::endl;
std::cout << i2->first << ": " << i2->second.value->first << std::endl;
std::cout << i3->first << ": " << i3->second.value->first << std::endl;
}
with OUTPUT:
1: 1
1: 1
2: 2
Full link: http://ideone.com/gnEhw
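A variant of the same idea, as a minimal sketch rather than a drop-in replacement: store plain pointers to the map's elements instead of iterators. std::map keeps pointers and references to an element valid until that element is erased, and naming std::pair<const std::size_t, Link>* does not require Link to be a complete type, so no template-template machinery is needed. The names Link, Entry, LinkMap, insertSelfLinked, link and followLinks are made up for this illustration. The trade-off is that a pointer, unlike an iterator, cannot be incremented to walk the map.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

struct Link;  // forward declaration; completed below

// The map's element type. Only a pointer to it is used inside Link,
// so Link may still be incomplete at this point.
using Entry = std::pair<const std::size_t, Link>;

struct Link {
    Entry* next = nullptr;  // points at another element of the same map
};

using LinkMap = std::map<std::size_t, Link>;

// Insert `key` if absent and return a pointer to its entry.
// A newly inserted entry links to itself.
Entry* insertSelfLinked(LinkMap& m, std::size_t key) {
    auto res = m.insert(Entry(key, Link{}));
    Entry* e = &*res.first;
    if (res.second) e->second.next = e;
    return e;
}

// Make `from` point at `to`, inserting either key if needed.
void link(LinkMap& m, std::size_t from, std::size_t to) {
    Entry* target = insertSelfLinked(m, to);
    insertSelfLinked(m, from)->second.next = target;
}

// One O(log n) lookup, then `steps` constant-time hops.
std::size_t followLinks(const LinkMap& m, std::size_t start, int steps) {
    auto it = m.find(start);
    assert(it != m.end());  // assumption: the start key must exist
    const Entry* e = &*it;
    for (int i = 0; i < steps; ++i) e = e->second.next;
    return e->first;
}
```

As with the iterator version, copying the map leaves the pointers referring to the old map's elements.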

If I understood your problem correctly, I think I would keep my elements in a vector and use a vector of indices into the first vector for the kind of indirection you want. If you also need ordered access you can always throw in a map to the elements of the first vector.
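A rough sketch of that layout, assuming the elements are identified by their position in the first vector: next[i] holds the index that element i links to, so each hop is a constant-time array access. The name followIndices is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// next[i] is the index of the element that element i links to.
// Follows `steps` links starting at `start` and returns the final index.
std::size_t followIndices(const std::vector<std::size_t>& next,
                          std::size_t start, std::size_t steps) {
    std::size_t cur = start;
    for (std::size_t i = 0; i < steps; ++i) {
        assert(cur < next.size());  // every link must stay inside the vector
        cur = next[cur];
    }
    return cur;
}
```

Unlike map iterators, indices survive copying the container, but erasing an element from the middle of the vector shifts the indices after it.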

Related

Best way to calculate a running hash for an unordered_map?

I've got a simple wrapper-class for std::unordered_map that updates a running hash-code for the unordered_map's contents, as key-value pairs are added or removed; that way I never have to iterate over the entire contents to get the current hash code for the set. It does this by adding to the _hash member-variable whenever a new key-value pair is added, and subtracting from the _hash member-variable whenever an existing key-value pair is removed. This all works fine (but see the toy implementation below if you want a code-example of what I mean).
My only concern is that I suspect that simply adding and subtracting values from _hash might not be the optimal thing to do from the perspective of minimizing the likelihood of hash-value collisions. Is there a mathematically better way to compute the running-hash-code for the table, that would still preserve my ability to efficiently add/remove items from the table (i.e. without forcing me to iterate over the table to rebuild a hash code from scratch every time?)
#include <functional>
#include <unordered_map>
#include <string>
#include <iostream>
template<typename KeyType, typename ValueType> class UnorderedMapWithHashCode
{
public:
UnorderedMapWithHashCode() : _hash(0) {/* empty */}
void Clear() {_map.clear(); _hash = 0;}
void Put(const KeyType & k, const ValueType & v)
{
Remove(k); // to deduct any existing value from _hash
_hash += GetHashValueForPair(k, v);
_map[k] = v;
}
void Remove(const KeyType & k)
{
if (_map.count(k) > 0)
{
_hash -= GetHashValueForPair(k, _map[k]);
_map.erase(k);
}
}
const std::unordered_map<KeyType, ValueType> & GetContents() const {return _map;}
std::size_t GetHashCode() const {return _hash;}
private:
std::size_t GetHashValueForPair(const KeyType & k, const ValueType & v) const
{
return std::hash<KeyType>()(k) + std::hash<ValueType>()(v);
}
std::unordered_map<KeyType, ValueType> _map;
std::size_t _hash;
};
int main(int, char **)
{
UnorderedMapWithHashCode<std::string, int> map;
std::cout << "A: Hash is " << map.GetHashCode() << std::endl;
map.Put("peanut butter", 5);
std::cout << "B: Hash is " << map.GetHashCode() << std::endl;
map.Put("jelly", 25);
std::cout << "C: Hash is " << map.GetHashCode() << std::endl;
map.Remove("peanut butter");
std::cout << "D: Hash is " << map.GetHashCode() << std::endl;
map.Remove("jelly");
std::cout << "E: Hash is " << map.GetHashCode() << std::endl;
return 0;
}
Your concept's perfectly fine, just the implementation could be improved:
you could take the hash functions as template arguments that default to the relevant std::hash instantiations. Note that for numbers it's common (GCC, Clang, Visual C++) for std::hash<> to be an identity hash, which is moderately collision prone. GCC and Clang mitigate that somewhat by using a prime number of buckets (vs Visual C++'s power-of-2 choice), but here you need to avoid distinct key/value entries colliding in the size_t hash-value space, rather than after the modulo by bucket count, so you would be better off with a meaningful hash function. Similarly, Visual C++'s std::string hash only incorporates 10 characters spaced along the string (so it's constant time), so if your key and value were both similar, same-length long strings differing in only a few characters, that would be horribly collision prone too. GCC uses a proper hash function for strings (MurmurHash).
return std::hash<KeyType>()(k) + std::hash<ValueType>()(v); is a mediocre idea in general and an awful idea when using an identity hash function (e.g. h({k,v}) == k + v, so h({4,2}) == h({2,4}) == h({1,5}) etc.).
consider using something based on boost::hash_combine instead (assuming you do adopt the above advice to have template parameters provide the hash functions):
auto key_hash = KeyHashPolicy()(key);
return (key_hash ^ ValueHashPolicy()(value)) +
0x9e3779b9 + (key_hash << 6) + (key_hash >> 2);
you could dramatically improve the efficiency of your operations by avoiding unnecessary hash table lookups (your Put does 2-4 table lookups, and Remove does 1-3):
void Put(const KeyType& k, const ValueType& v)
{
auto it = _map.find(k);
if (it == _map.end()) {
_map[k] = v;
} else {
if (it->second == v) return;
_hash -= GetHashValueForPair(k, it->second);
it->second = v;
}
_hash += GetHashValueForPair(k, v);
}
void Remove(const KeyType& k)
{
auto it = _map.find(k);
if (it == _map.end()) return;
_hash -= GetHashValueForPair(k, it->second);
_map.erase(it);
}
if you want to optimise further, you can create a version of GetHashValueForPair that returns the key's hash (the KeyHashPolicy result) and lets you pass it back in, to avoid hashing the key twice in Put.
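Putting those points together, here is a minimal sketch under a few assumptions: the class name RunningHashMap, the template parameters KeyHash and ValueHash, and the helper Combine are invented for this example, and the mixing step is only modelled on boost::hash_combine, not a call into Boost. Per-entry hashes are still added and subtracted, so the total stays independent of insertion order and can be updated incrementally.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

template <typename K, typename V,
          typename KeyHash = std::hash<K>,
          typename ValueHash = std::hash<V>>
class RunningHashMap {
public:
    void Put(const K& k, const V& v) {
        // Hash the key once for the running hash and reuse it below.
        const std::size_t kh = KeyHash()(k);
        auto it = map_.find(k);
        if (it == map_.end()) {
            map_.emplace(k, v);
        } else {
            if (it->second == v) return;       // nothing changes
            hash_ -= Combine(kh, it->second);  // deduct the old pair
            it->second = v;
        }
        hash_ += Combine(kh, v);
    }

    void Remove(const K& k) {
        auto it = map_.find(k);
        if (it == map_.end()) return;
        hash_ -= Combine(KeyHash()(k), it->second);
        map_.erase(it);
    }

    std::size_t GetHashCode() const { return hash_; }

private:
    // Mixes key and value hashes so that swapping them changes the result.
    static std::size_t Combine(std::size_t kh, const V& v) {
        return (kh ^ ValueHash()(v)) + 0x9e3779b9 + (kh << 6) + (kh >> 2);
    }

    std::unordered_map<K, V, KeyHash> map_;
    std::size_t hash_ = 0;  // unsigned, so += and -= wrap around safely
};
```

Because std::size_t arithmetic wraps, removing every entry brings the running hash back to exactly 0.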

Map reference confusion

I've run into a weird problem. I have a class which stores its values inside a map, but in one case I need to expose the map so that some external calculation can read it and possibly add data to it.
The problem is this: I hold a shared_ptr to that class and expose the map through a reference, but during processing the map won't accept new data.
I wrote a dummy example to make this clear. What is happening here, and why?
Why don't changes made to the map persist after the function ends?
#include <map>
#include <iostream>
#include <memory>
class MapWrap {
public:
MapWrap() {}
~MapWrap(){}
std::map<int, int>& getMap() { return map; }
private:
std::map<int, int> map;
};
void goGo(std::shared_ptr<MapWrap> m){
auto map = m->getMap();
std::cout << "Func: before: map size: " << map.size() << std::endl;
for(int i = 0; i < 3; ++i){
// This should and will add new value to map.
if(map[i] == 3){
std::cout << "blah" << std::endl;
}
}
std::cout << "Func: after: map size: " << map.size() << std::endl;
}
int main(){
auto mapWrap = std::make_shared<MapWrap>();
for(int i = 0; i < 3; ++i){
goGo(mapWrap);
}
return 0;
}
EDIT: Removed const from getMap() method.
The problem is that here:
auto map = m->getMap();
the type of map is std::map<int, int>, so you make a copy and modify that copy. Change it to:
auto& map = m->getMap();
and you will modify the passed map instead of a copy.
By the way, if you don't know what type your auto variable has, you can always use a compiler error to check:
template<typename T> struct TD;
auto map = m->getMap();
TD<decltype(map)> dd;
will result in:
main.cpp:19:21: error: aggregate 'TD<std::map<int, int> > dd' has incomplete type and cannot be defined
TD<decltype(map)> dd;
Here you can read that the type of map is std::map<int, int>.
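If you would rather have a check that compiles when the type is what you expect, a static_assert with std::is_same does the same job. This is only a sketch: MapWrap is trimmed down from the question, and sizeAfterCopyWrite and sizeAfterRefWrite are hypothetical helpers that show the difference between auto and auto&.

```cpp
#include <cstddef>
#include <map>
#include <type_traits>

class MapWrap {
public:
    std::map<int, int>& getMap() { return map_; }
private:
    std::map<int, int> map_;
};

// Plain `auto` deduces std::map<int, int>, so the write goes to a local copy.
std::size_t sizeAfterCopyWrite(MapWrap& w) {
    auto copy = w.getMap();
    static_assert(std::is_same<decltype(copy), std::map<int, int>>::value,
                  "auto drops the reference");
    copy[0] = 1;
    return w.getMap().size();  // the wrapped map is unchanged
}

// `auto&` deduces std::map<int, int>&, so the write reaches the wrapped map.
std::size_t sizeAfterRefWrite(MapWrap& w) {
    auto& ref = w.getMap();
    static_assert(std::is_same<decltype(ref), std::map<int, int>&>::value,
                  "auto& keeps the reference");
    ref[0] = 1;
    return w.getMap().size();
}
```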

print map values in descending order in c++

I have a map input which contains a list of words and their counts.
I use this function to print the map input:
template <class KTy, class Ty>
void PrintMap(map<KTy, Ty> map)
{
typedef typename std::map<KTy, Ty>::iterator iterator;
for (iterator p = map.begin(); p != map.end(); p++)
cout << p->first << ": " << p->second << endl;
}
It prints the values like this:
you : 296
she : 14
go : 29
How can I print it in descending order of word count?
Try the following:
// Copy it into a vector.
std::vector<std::pair<std::string,int>> vector( map.begin(), map.end() );
// Sort the vector according to the word count in descending order.
std::sort( vector.begin(), vector.end(),
[]( const auto & lhs, const auto & rhs )
{ return lhs.second > rhs.second; } );
// Print out the vector.
for ( const auto & item : vector )
std::cout << item.first << ": " << item.second << std::endl;
The map is sorted according to the keys, i.e. alphabetically. You cannot change that behaviour. Therefore, the easiest way to get the job done is to copy it into a vector and sort that with a user-defined compare function.
A map stores its elements sorted by key, not by value. Unless you want to change the type of your map, the only way to do it is to sort your elements after getting them out. I think the easiest way to do so is via std::sort.
Given that dereferencing an iterator into a const map gives a const value_type&, we can store pointers to the elements and avoid actually creating copies of value_type (which is a std::pair<const Key, Value>). However, the code will be a little longer than if we wanted to copy:
template <class KTy, class Ty>
void PrintMap(const std::map<KTy, Ty>& map)
{
using vt = const typename std::map<KTy, Ty>::value_type*;
std::vector<vt> vec(map.size());
size_t i = 0;
for(const auto& keyval : map)
{
vec[i++] = &keyval;
}
std::sort(std::begin(vec), std::end(vec), [](vt _lhs, vt _rhs){return _lhs->second > _rhs->second;});
for(const auto& el : vec)
std::cout << el->first << ": " << el->second << std::endl;
}
With a test
std::map<std::string, int> myMap{{"you", 296}, {"she", 14}, {"go", 29}};
PrintMap(myMap);
Outputs
you: 296
go: 29
she: 14
This should be much faster if your map's elements are expensive to copy.
Put data in another container, sort by word count, print.
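One possible "other container", sketched here as an assumption rather than the only option, is a std::multimap keyed by count with std::greater<int> as the comparator, so iterating it already yields descending counts. The function name byDescendingCount is made up for the example.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns (count, word) pairs ordered by descending count.
std::vector<std::pair<int, std::string>>
byDescendingCount(const std::map<std::string, int>& counts) {
    // Flip each (word, count) entry into a multimap sorted by count, largest first.
    std::multimap<int, std::string, std::greater<int>> flipped;
    for (const auto& kv : counts)
        flipped.emplace(kv.second, kv.first);
    return std::vector<std::pair<int, std::string>>(flipped.begin(), flipped.end());
}
```

Printing is then a plain loop over the returned vector, writing item.second followed by item.first.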

C++ class specialization when dealing with STL containers

I'd like a function to return the size in bytes of an object for fundamental types. I'd also like it to return the total size in bytes of an STL container. (I know this is not necessarily the size of the object in memory, and that's okay).
To this end, I've coded a memorysize namespace with a bytes function such that memorysize::bytes(double x) = 8 (on most compilers).
I've specialized it to correctly handle std::vector<double> types, but I don't want to code a different function for each class of the form std::vector<ANYTHING>, so how do I change the template to correctly handle this case?
Here's the working code:
#include <iostream>
#include <vector>
// return the size of bytes of an object (sort of...)
namespace memorysize
{
/// general object
template <class T>
size_t bytes(const T & object)
{
return sizeof(T);
}
/// specialization for a vector of doubles
template <>
size_t bytes<std::vector<double> >(const std::vector<double> & object)
{
return sizeof(std::vector<double>) + object.capacity() * bytes(object[0]);
}
/// specialization for a vector of anything???
}
int main(int argc, char ** argv)
{
// make sure it works for general objects
double x = 1.;
std::cout << "double x\n";
std::cout << "bytes(x) = " << memorysize::bytes(x) << "\n\n";
int y = 1;
std::cout << "int y\n";
std::cout << "bytes(y) = " << memorysize::bytes(y) << "\n\n";
// make sure it works for vectors of doubles
std::vector<double> doubleVec(10, 1.);
std::cout << "std::vector<double> doubleVec(10, 1.)\n";
std::cout << "bytes(doubleVec) = " << memorysize::bytes(doubleVec) << "\n\n";
// would like a new definition to make this work as expected
std::vector<int> intVec(10, 1);
std::cout << "std::vector<int> intVec(10, 1)\n";
std::cout << "bytes(intVec) = " << memorysize::bytes(intVec) << "\n\n";
return 0;
}
How do I change the template specification to allow for the more general std::vector<ANYTHING> case?
Thanks!
Modified your code accordingly:
/// specialization for a vector of anything
template < typename Anything >
size_t bytes(const std::vector< Anything > & object)
{
return sizeof(std::vector< Anything >) + object.capacity() * bytes( object[0] );
}
Note that this version has a problem when bytes is invoked with an empty vector, since object[0] does not exist.
Edit: Scratch that. If I remember your previous question correctly, then if you get a vector of strings then you would like to take into account the size taken by each string. So instead you should do
/// specialization for a vector of anything
template < typename Anything >
size_t bytes(const std::vector< Anything > & object)
{
size_t result = sizeof(std::vector< Anything >);
for (const auto & elem : object)
result += bytes( elem );
result += ( object.capacity() - object.size() ) * sizeof( Anything );
return result;
}
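For reference, a compilable sketch of the two overloads side by side, assuming C++11 for the range-based for loop. The vector version is an overload of the function template, not a specialization, and overload resolution prefers it for any std::vector because it is the more specialized template.

```cpp
#include <cstddef>
#include <vector>

namespace memorysize
{
    /// general object
    template <class T>
    std::size_t bytes(const T &)
    {
        return sizeof(T);
    }

    /// overload for a vector of anything, including nested vectors
    template <class Anything>
    std::size_t bytes(const std::vector<Anything> & object)
    {
        std::size_t result = sizeof(std::vector<Anything>);
        // count each stored element through bytes() so nested containers recurse
        for (const auto & elem : object)
            result += bytes(elem);
        // reserved but unused slots only take up sizeof(Anything) each
        result += (object.capacity() - object.size()) * sizeof(Anything);
        return result;
    }
}
```

An empty vector is handled without touching object[0]: the loop body never runs and only the reserved capacity is counted.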

multimap accumulate values

I have a multimap defined by
typedef std::pair<int, int> comp_buf_pair; //pair<comp_t, dij>
typedef std::pair<int, comp_buf_pair> node_buf_pair;
typedef std::multimap<int, comp_buf_pair> buf_map; //key=PE, value = pair<comp_t, dij>
typedef buf_map::iterator It_buf;
int summ (int x, int y) {return x+y;}
int total_buf_size = 0;
std::cout << "\nUpdated buffer values" << std::endl;
for(It_buf it = bufsz_map.begin(); it!= bufsz_map.end(); ++it)
{
comp_buf_pair it1 = it->second;
// max buffer size will be summ(it1.second)
//total_buf_size = std::accumulate(bufsz_map.begin(), bufsz_map.end(), &summ); //error??
std::cout << "Total buffers required for this config = " << total_buf_size << std::endl;
std::cout << it->first << " : " << it1.first << " : " << it1.second << std::endl;
}
I would like to sum all the values pointed by it1.second
How can the std::accumulate function access the second iterator values?
Your issue is with the summ function, you actually need something better than that to be able to handle 2 mismatched types.
If you're lucky, this could work:
int summ(int x, buf_map::value_type const& v) { return x + v.second.second; }
If you're unlucky (depending on how accumulate is implemented), you could always:
struct Summer
{
typedef buf_map::value_type const& s_type;
int operator()(int x, s_type v) const { return x + v.second.second; }
int operator()(s_type v, int x) const { return x + v.second.second; }
};
And then use:
int result = std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, Summer());
I think you'll just need to change your summ function to take the map value_type instead. This is totally untested but it should give the idea.
int summ (int x, const buf_map::value_type& y)
{
return x + y.second.second;
}
And call it:
total_buf_size = std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, &summ);
Why do you mess about with pairs containing pairs? It is too complicated and you'll wind up making errors. Why not define a struct?
Accumulate is a generalization of summation: it computes the sum (or some other binary operation) of init and all of the elements in the range [first, last).
... The result is first initialized to init. Then, for each iterator i in [first, last), in order from beginning to end, it is updated by result = result + *i (in the first version) or result = binary_op(result, *i) (in the second version).
Sgi.com
Your attempt was neither the first nor the second version: you left out the init argument. Note that summ must also accept the map's value_type as its second parameter.
total_buf_size = std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, &summ);
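A small self-contained sketch of that call, assuming C++11 so the binary operation can be a lambda. The typedefs are copied from the question; the function name total_dij is hypothetical. std::accumulate always invokes the operation as op(accumulator, element), so a single (int, value_type) signature is enough.

```cpp
#include <map>
#include <numeric>
#include <utility>

typedef std::pair<int, int> comp_buf_pair;                // pair<comp_t, dij>
typedef std::multimap<int, comp_buf_pair> buf_map;        // key = PE

// Sums the dij part (value.second) of every entry in the multimap.
int total_dij(const buf_map& m)
{
    return std::accumulate(m.begin(), m.end(), 0,
        [](int acc, const buf_map::value_type& entry) {
            // entry.first is the PE key, entry.second is pair<comp_t, dij>
            return acc + entry.second.second;
        });
}
```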