Create hash map with parameters ordered in the way inserted - c++

I want to change the std::map (its hash function) so that, when iterated, it returns pairs in the order they were inserted. I have tried unordered_map, but with no success. So I guess I have to create a hash function that increments and returns a bigger value every time. How can I do that? I'm not working with big data, so performance is not a problem.

It sounds like you want an ordered map, a hashmap which allows O(1) lookups but remembers the order of insertion (for iteration order). Luckily, an efficient implementation exists in C++:
https://github.com/Tessil/ordered-map
An example from the README is:
#include <iostream>
#include <tsl/ordered_map.h>
int main()
{
tsl::ordered_map<char, int> map = {{'d', 1}, {'a', 2}, {'g', 3}};
map.insert({'b', 4});
map['h'] = 5;
map['e'] = 6;
map.erase('a');
// {d, 1} {g, 3} {b, 4} {h, 5} {e, 6}
for(const auto& key_value : map) {
std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}
return 0;
}
You may also implement it like Python's ordered map, which uses a linked list to track insertion order. An advantage is more efficient deletions (Tessil's ordered-map has O(n) deletions, while a linked list/map would be average O(1)). Assuming the map would normally store a key/value type of type Key, T, you would store T in the linked list, and store an iterator as the internal unordered_map's value type. You would then wrap the class and any iterators to ensure it acts from the outside like a normal hashmap.
The basic outline would look something like the following (the rest is for you to implement, if O(n) deletions are unacceptable). Lists are used since they do not invalidate iterators when an element is removed, and the map is used to find the correct list iterator for deletions by key. Satisfies all the big O issues, albeit with some overhead.
#include <functional>
#include <list>
#include <memory>
#include <unordered_map>
#include <utility>
template <
typename Key,
typename T,
typename Hash = std::hash<Key>,
typename KeyEqual = std::equal_to<Key>,
typename Allocator = std::allocator<std::pair<const Key, T>>
>
class ordered_map
{
public:
using key_type = Key;
using mapped_type = T;
// ....
private:
// Values live in the list (insertion order); the map finds a value's list node by key.
using list_type = std::list<T, typename std::allocator_traits<Allocator>::template rebind_alloc<T>>;
using list_iterator = typename list_type::iterator;
using map_type = std::unordered_map<
Key, list_iterator, Hash, KeyEqual,
typename std::allocator_traits<Allocator>::template rebind_alloc<std::pair<const Key, list_iterator>>
>;
list_type list_;
map_type map_;
};
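To make the outline above more concrete, here is a minimal sketch of the list + unordered_map idea. It is not the answer's full class: the name simple_ordered_map and the helper keys_in_order are made up for illustration, allocator and hasher customization is left out, and the list stores whole (key, value) pairs rather than only T so that iteration can show the keys.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal insertion-ordered map: the list owns the (key, value)
// pairs in insertion order, the unordered_map points each key at its list node.
template <typename Key, typename T>
class simple_ordered_map {
public:
    using value_type = std::pair<const Key, T>;
    using list_type = std::list<value_type>;
    using iterator = typename list_type::iterator;

    // Appends (key, value) if the key is new; otherwise leaves the existing
    // entry untouched. Returns true if an insertion happened.
    bool insert(const Key& key, const T& value) {
        if (index_.find(key) != index_.end()) return false;
        items_.emplace_back(key, value);
        index_.emplace(key, std::prev(items_.end()));
        return true;
    }

    // Average O(1): one hash lookup plus unlinking one list node.
    bool erase(const Key& key) {
        auto pos = index_.find(key);
        if (pos == index_.end()) return false;
        items_.erase(pos->second);
        index_.erase(pos);
        return true;
    }

    // Returns a pointer to the mapped value, or nullptr if the key is absent.
    T* find(const Key& key) {
        auto pos = index_.find(key);
        return pos == index_.end() ? nullptr : &pos->second->second;
    }

    iterator begin() { return items_.begin(); }
    iterator end() { return items_.end(); }
    std::size_t size() const { return items_.size(); }

private:
    list_type items_;
    std::unordered_map<Key, iterator> index_;
};

// Demonstration helper: concatenates the keys in iteration order.
inline std::string keys_in_order(simple_ordered_map<char, int>& m) {
    std::string out;
    for (auto& kv : m) out.push_back(kv.first);
    return out;
}
```

Because std::list never invalidates iterators to the remaining nodes, the iterators stored in the index stay valid across inserts and erases of other keys. The price is that each key is stored twice.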

I think it would be better to use:
std::vector<std::pair<key, value>> myValues;
For insertion you would use code like this:
myValues.emplace_back(someKey, someValue);
Preferably you will do this in a loop. Then, when you want to get the items in the order they were inserted, you simply iterate over the vector. For example, using a range-based for loop to output the elements in insertion order:
for(auto & pair : myValues)
{
std::cout << "Key: " << pair.first << " Value: " << pair.second << std::endl;
}
Doing this with unordered_map or a regular map would be a poor use of those containers.
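If you still need lookup by key with the vector approach, a linear search is usually enough for small data. The sketch below assumes string keys and int values; the alias Entries and the helpers find_value and insert_unique are hypothetical names, not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Entries = std::vector<std::pair<std::string, int>>;

// Linear search by key; returns a pointer to the value, or nullptr on a miss.
inline int* find_value(Entries& entries, const std::string& key) {
    auto it = std::find_if(entries.begin(), entries.end(),
        [&key](const std::pair<std::string, int>& kv) { return kv.first == key; });
    return it == entries.end() ? nullptr : &it->second;
}

// Appends the pair only if the key is not present yet, mimicking map::insert.
inline bool insert_unique(Entries& entries, const std::string& key, int value) {
    if (find_value(entries, key) != nullptr) return false;
    entries.emplace_back(key, value);
    return true;
}
```

Every lookup is O(n), which matches the "performance is not a problem" premise of the question but would not scale to large inputs.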

Related

How can I reverse loop over a map by value?

I need to get the pairs of the map sorted by their values, and I wonder if it is possible without a temporary declaration.
I know I can sort it if I make another map with the keys and values swapped, but I am searching for a better solution.
I can't sort the elements afterwards because I only need to extract the chars and put them in an array, for example.
std::map<char,int> list = {{'A',4},{'V',2},{'N',1},{'J',5},{'G',3}};
for(/* code here */){
std::cout << /* code here */ << std::endl;
}
Desired output:
J 5
A 4
G 3
V 2
N 1
This cannot be done with std::map.
This template has an optional template argument which allows for a custom sorting, but this sorting can only be done on the map's key:
template<
class Key,
class T,
class Compare = std::less<Key>,
class Allocator = std::allocator<std::pair<const Key, T> >
> class map;
std::map is a sorted associative container that contains key-value pairs with unique keys. Keys are sorted by using the comparison function Compare.
This choice was made so as not to require the map's value to be an orderable type.
As an alternative, you can use type std::vector<std::pair<char, int>> in combination with std::sort.
#include <vector>
#include <utility>
#include <algorithm>
#include <iostream>
int main()
{
std::vector<std::pair<char,int>> list = {{'A',4},{'V',2},{'N',1},{'J',5},{'G',3}};
std::sort(begin(list), end(list), [](auto const& lhs, auto const& rhs) {
// Descending by value; ties are broken by descending key.
return lhs.second != rhs.second ? lhs.second > rhs.second : lhs.first > rhs.first;
});
for(auto const& pair : list) {
std::cout << pair.first << ", " << pair.second << '\n';
}
}
J, 5
A, 4
G, 3
V, 2
N, 1
The most appropriate solution depends on HOW are you going to use that list. Is it created once and queried once? Then anything will do: you can just sort the output.
However, if this is a "live" list that is constantly updated, and you need to query it frequently, I would suggest keeping another map of values to vector of keys. Then you would be able to get your result instantly, at the cost of more expensive insertion, of course.
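A rough sketch of that "second map from values to keys" idea follows. The class name ScoreBoard and its members are invented for illustration, and it assumes char keys and int values as in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical wrapper: by_key_ is the primary map, by_value_ groups the keys
// under each value and is sorted from the highest value to the lowest.
class ScoreBoard {
public:
    // Inserts or updates a key and keeps the secondary index in sync.
    void set(char key, int value) {
        auto pos = by_key_.find(key);
        if (pos != by_key_.end()) {
            remove_from_index(key, pos->second);
            pos->second = value;
        } else {
            by_key_.emplace(key, value);
        }
        by_value_[value].push_back(key);
    }

    // Keys ordered by descending value; no sorting is needed at query time.
    std::string keys_by_value_desc() const {
        std::string out;
        for (const auto& bucket : by_value_)
            for (char k : bucket.second) out.push_back(k);
        return out;
    }

private:
    void remove_from_index(char key, int old_value) {
        auto bucket = by_value_.find(old_value);
        if (bucket == by_value_.end()) return;
        auto& keys = bucket->second;
        keys.erase(std::remove(keys.begin(), keys.end(), key), keys.end());
        if (keys.empty()) by_value_.erase(bucket);
    }

    std::map<char, int> by_key_;
    std::map<int, std::vector<char>, std::greater<int>> by_value_;
};
```

Each update now touches two containers, which is the extra insertion cost mentioned above.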

STL structures: "insert if not present" operation?

I'm working on a program right now dealing with some exponential time algorithms. Because of this, a main loop of my program is being run many times, and I'm trying to optimize it as much as possible.
Profiling shows that a large portion of the time is spent in look-up and hash calculation for std::unordered_map.
I'm wondering:
Is there a way to cache the hash value of a key for std::unordered_map, and then provide it as an argument to insert later on?
Is there a way that I can do the following in a single operation: given an key and value {x,y}, check if key x is in the map, if it isn't, insert it and return {x,y}, otherwise return {x,z} for whatever z is already in the map.
I'm doing something like this right now, but it's inefficient, because I have to calculate the hash for the key and check if it's in the map. But if it isn't in the map, I do a completely separate insert operation. In theory, checking if it is present in the map should find where it would go in the map if inserted.
I'm open to trying other data structures, like std::map or something from Boost, if they would reduce the time for this operation.
You could just use the return value of std::unordered_map::insert() to achieve key existence checking + insertion with single hash calculation.
template<typename K, typename V>
std::pair<K, V> myinsert(std::unordered_map<K, V> &map, const std::pair<K, V> &kv)
{
return *(map.insert(kv).first);
}
You can't cache the hash of the key, but if you have an iterator to where it was last time (either from when you originally inserted, or the last time you successfully found the item) you can use the insert( const_iterator hint, value_type&& value ); member which also helpfully returns an iterator to either the newly inserted element or the previously existing element that blocked insertion.
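As a small illustration of the hinted overload, the sketch below wraps it in a hypothetical helper named insert_with_hint (the alias Cache is also made up). The hint is only a suggestion to the container, so this should not be read as a guaranteed way to skip the hash computation.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

using Cache = std::unordered_map<std::string, int>;

// Inserts {key, value} if the key is absent and returns the stored value
// either way. The returned iterator of insert(hint, value) points at the new
// element or at the existing one that blocked the insertion.
inline int insert_with_hint(Cache& cache, Cache::const_iterator hint,
                            const std::string& key, int value) {
    Cache::iterator it = cache.insert(hint, Cache::value_type(key, value));
    return it->second;
}
```

Unlike the unhinted overload, this one returns a plain iterator, so it does not tell you whether an insertion took place.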
You could just use std::map::emplace().
The insertion only takes place if no other element in the container has a key equivalent to the one being emplaced (keys in a map container are unique).
Example
#include <iostream>
#include <utility>
#include <string>
#include <map>
typedef std::map<std::string, std::string> StringMap;
typedef std::pair<StringMap::iterator, bool> PairKey;
int main()
{
std::map<std::string, std::string> m;
// uses pair's template constructor
m.emplace("a", "aaa");
m.emplace("b", "bbb");
m.emplace("d", "ddd");
PairKey pair = m.emplace("d", "dddddd");
if (pair.second == false)
std::cout<< "d was existed so value didn't change" << std::endl;
std::cout<<"-------MAP_LIST------" << std::endl;
for (const auto &p : m)
std::cout << p.first << " => " << p.second << '\n';
std::cout<<"-------========------" << std::endl;
}
Output:
d was existed so value didn't change
-------MAP_LIST------
a => aaa
b => bbb
d => ddd
-------========------

How to randomly shuffle values in a map?

I have a std::map with both key and value as integers. Now I want to randomly shuffle the map, so keys point to a different value at random. I tried random_shuffle but it doesn't compile. Note that I am not trying to shuffle the keys, which makes no sense for a map. I'm trying to randomise the values.
I could push the values into a vector, shuffle that and then copy back. Is there a better way?
You can push all the keys in a vector, shuffle the vector and use it to swap the values in the map.
Here is an example:
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <algorithm>
#include <random>
#include <ctime>
using namespace std;
int myrandom (int i) { return std::rand()%i;}
int main ()
{
srand(time(0));
map<int,string> m;
vector<int> v;
for(int i=0; i<10; i++)
m.insert(pair<int,string>(i,("v"+to_string(i))));
for(auto i: m)
{
cout << i.first << ":" << i.second << endl;
v.push_back(i.first);
}
random_shuffle(v.begin(), v.end(),myrandom);
vector<int>::iterator it=v.begin();
cout << endl;
for(auto& i:m)
{
string ts=i.second;
i.second=m[*it];
m[*it]=ts;
it++;
}
for(auto i: m)
{
cout << i.first << ":" << i.second << endl;
}
return 0;
}
The complexity of your proposal is O(N) (both the copies and the shuffle have linear complexity), which seems optimal (looking at fewer elements would introduce non-randomness into your shuffle).
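For reference, the "copy out, shuffle, copy back" proposal can be sketched with std::shuffle, which replaces std::random_shuffle in current C++. The function name shuffle_values is made up for illustration, and the caller is assumed to supply a seeded generator.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Copies the values out, shuffles them, and writes them back in key order.
// The generator is passed in so that the caller controls the seeding.
template <typename Key, typename Value, typename Rng>
void shuffle_values(std::map<Key, Value>& m, Rng& rng) {
    std::vector<Value> values;
    values.reserve(m.size());
    for (const auto& kv : m) values.push_back(kv.second);

    std::shuffle(values.begin(), values.end(), rng);

    auto next = values.begin();
    for (auto& kv : m) kv.second = std::move(*next++);
}
```

A typical call would construct a std::mt19937 seeded from std::random_device and pass it in; the keys are never touched.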
If you want to repeatedly shuffle your data, you could maintain a map of type <Key, size_t> (i.e. the proverbial level of indirection) that indexes into a std::vector<Value> and then just shuffle that vector repeatedly. That saves you all the copying in exchange for O(N) space overhead. If the Value type itself is expensive, you have an extra vector<size_t> of indices into the real data on which you do the shuffling.
For convenience's sake, you could encapsulate the map and vector inside one class that exposes a shuffle() member function. Such a wrapper would also need to expose the basic lookup / insertion / erase functionality of the underlying map.
EDIT: As pointed out by @tmyklebu in the comments, maintaining (raw or smart) pointers to secondary data can be subject to iterator invalidation (e.g. when inserting new elements at the end causes the vector's capacity to be resized). Using indices instead of pointers solves the "insertion at the end" problem. But when writing the wrapper class you need to make sure that insertions of new key-value pairs never cause "insertions in the middle" for your secondary data because that would also invalidate the indices. A more robust library solution would be to use Boost.MultiIndex, which is specifically designed to allow multiple types of view over a data structure.
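A minimal sketch of such a wrapper, using indices rather than pointers, might look like this. The class name shuffle_map is hypothetical, and erase is deliberately left out because removing from the middle of the vector would require renumbering the stored indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical wrapper: the map holds key -> index, the vector holds values.
// Shuffling only permutes the vector, so the indices in the map stay valid.
template <typename Key, typename Value>
class shuffle_map {
public:
    // New values are always appended, so existing indices never shift.
    bool insert(const Key& key, const Value& value) {
        if (index_.count(key) != 0) return false;
        index_.emplace(key, values_.size());
        values_.push_back(value);
        return true;
    }

    // Throws std::out_of_range (from map::at) if the key is unknown.
    const Value& at(const Key& key) const { return values_[index_.at(key)]; }

    template <typename Rng>
    void shuffle(Rng& rng) { std::shuffle(values_.begin(), values_.end(), rng); }

    std::size_t size() const { return values_.size(); }

private:
    std::map<Key, std::size_t> index_;
    std::vector<Value> values_;
};
```

Repeated shuffles cost one std::shuffle over the vector and no copying between containers.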
Well, using only the map, I can think of this:
make a flag array with one entry per cell of the map, randomly generate two integers i and j such that 0 <= i, j < size of the map, swap those two cells and mark them as swapped. Repeat until all cells are marked.
EDIT: the array is allocated with the size of the map, and is a local array.
I doubt it...
But... Why not write a quick class that has 2 vectors in it: a sorted std::vector of keys and a std::vector of values shuffled with std::random_shuffle? Look up the key using std::lower_bound and use std::distance and std::next to get the value. Easy!
Without thinking too deeply, this should have similar complexity to std::map and possibly better locality of reference.
Some untested and unfinished code to get you started.
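template <class Key, class T>
class random_map
{
public:
T& at(Key const& key);
void shuffle();
private:
std::vector<Key> d_keys; // Hold the keys of the *map*; MUST be sorted.
std::vector<T> d_values; // d_values[i] belongs to d_keys[i].
};
template <class Key, class T>
T& random_map<Key, T>::at(Key const& key)
{
auto lb = std::lower_bound(d_keys.begin(), d_keys.end(), key);
// lower_bound returns the first key not less than `key`; reject a miss.
if(lb == d_keys.end() || key < *lb) {
throw std::out_of_range("random_map::at: key not found");
}
auto delta = std::distance(d_keys.begin(), lb);
auto it = std::next(d_values.begin(), delta);
return *it;
}
template <class Key, class T>
void random_map<Key, T>::shuffle()
{
// Shuffle the values only; the keys must stay sorted for lower_bound.
// (std::random_shuffle was removed in C++17; std::shuffle replaces it.)
std::random_shuffle(d_values.begin(), d_values.end());
}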
If you want to shuffle the map in place, you can implement your own version of random_shuffle for your map. The solution still requires placing the keys into a vector, which is done below using transform:
typedef std::map<int, std::string> map_type;
map_type m;
m[10] = "hello";
m[20] = "world";
m[30] = "!";
std::vector<map_type::key_type> v(m.size());
std::transform(m.begin(), m.end(), v.begin(),
[](const map_type::value_type &x){
return x.first;
});
srand48(time(0));
auto n = m.size();
for (auto i = n-1; i > 0; --i) {
map_type::size_type r = drand48() * (i+1);
std::swap(m[v[i]], m[v[r]]);
}
I used drand48()/srand48() for a uniform pseudo random number generator, but you can use whatever is best for you.
Alternatively, you can shuffle v, and then rebuild the map, such as:
std::random_shuffle(v.begin(), v.end());
map_type m2 = m;
int i = 0;
for (auto &x : m) {
x.second = m2[v[i++]];
}
But, I wanted to illustrate that implementing shuffle on the map in place isn't overly burdensome.
Here is my solution using std::reference_wrapper of C++11.
First, let's make a version of std::random_shuffle that shuffles references. It is a small modification of version 1 from here: using the get method to get to the referenced values.
template< class RandomIt >
void shuffleRefs( RandomIt first, RandomIt last ) {
typename std::iterator_traits<RandomIt>::difference_type i, n;
n = last - first;
for (i = n-1; i > 0; --i) {
using std::swap;
swap(first[i].get(), first[std::rand() % (i+1)].get());
}
}
Now it's easy:
template <class MapType>
void shuffleMap(MapType &map) {
std::vector<std::reference_wrapper<typename MapType::mapped_type>> v;
for (auto &el : map) v.push_back(std::ref(el.second));
shuffleRefs(v.begin(), v.end());
}

Make Map Key Sorted According To Insert Sequence

Without help from an additional container (like a vector), is it possible to make the map's keys iterate in the same sequence as the insertion sequence?
#include <map>
#include <iostream>
using namespace std;
int main()
{
map<const char*, int> m;
m["c"] = 2;
m["b"] = 2;
m["a"] = 2;
m["d"] = 2;
for (map<const char*, int>::iterator begin = m.begin(); begin != m.end(); begin++) {
// How can I get the loop sequence same as my insert sequence.
// c, b, a, d
std::cout << begin->first << std::endl;
}
getchar();
}
No. A std::map is a sorted container; the insertion order is not maintained. There are a number of solutions using a second container to maintain insertion order in response to another, related question.
That said, you should use std::string as your key. Using a const char* as a map key is A Bad Idea: it makes it near impossible to access or search for an element by its key because only the pointers will be compared, not the strings themselves.
No. std::map<Key, Data, Compare, Alloc> is sorted according to the third template parameter Compare, which defaults to std::less<Key>. If you want insert sequence you can use std::list<std::pair<Key, Data> >.
Edit:
As was pointed out, any sequential STL container would do: vector, deque, list, or in this particular case even string. You would have to decide on the merits of each.
Consider using a boost::multi_index container instead of a std::map. You can put both an ordered map index and an unordered sequential index on your container.

C++ STL map I don't want it to sort!

This is my code
map<string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(map<string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
cout<< (*i).first << ":"<<(*i).second<<endl;
}
Expected output:
B:123
A:321
But output it gives is:
A:321
B:123
I want it to maintain the order in which keys and values were inserted in the map<string,int>.
Is it possible? Or should I use some other STL data structure? Which one?
There is no standard container that does directly what you want. The obvious container to use if you want to maintain insertion order is a vector. If you also need look up by string, use a vector AND a map. The map would in general be of string to vector index, but as your data is already integers you might just want to duplicate it, depending on your use case.
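The "vector AND map" combination could be sketched as follows, with the map going from name to vector index. PersonTable and its members are hypothetical names; removal is left out because erasing from the middle of the vector would shift the stored indices.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

class PersonTable {
public:
    // Adds a new row or updates an existing one. The position of a name is
    // fixed by its first insertion.
    void set(const std::string& name, int id) {
        auto pos = index_.find(name);
        if (pos != index_.end()) {
            rows_[pos->second].second = id;
            return;
        }
        index_.emplace(name, rows_.size());
        rows_.emplace_back(name, id);
    }

    // O(log n) lookup by name; nullptr when the name is unknown.
    const int* find(const std::string& name) const {
        auto pos = index_.find(name);
        return pos == index_.end() ? nullptr : &rows_[pos->second].second;
    }

    // Rows in insertion order.
    const std::vector<std::pair<std::string, int>>& rows() const { return rows_; }

private:
    std::vector<std::pair<std::string, int>> rows_;
    std::map<std::string, std::size_t> index_;
};
```

Iterating rows() prints B before A for the example in the question, while find() keeps the logarithmic lookup of a map.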
Like Matthieu has said in another answer, the Boost.MultiIndex library seems the right choice for what you want. However, this library can be a little tough to use at the beginning especially if you don't have a lot of experience with C++. Here is how you would use the library to solve the exact problem in the code of your question:
struct person {
std::string name;
int id;
person(std::string const & name, int id)
: name(name), id(id) {
}
};
int main() {
using namespace::boost::multi_index;
using namespace std;
// define a multi_index_container with a list-like index and an ordered index
typedef multi_index_container<
person, // The type of the elements stored
indexed_by< // The indices that our container will support
sequenced<>, // list-like index
ordered_unique<member<person, string,
&person::name> > // map-like index (sorted by name)
>
> person_container;
// Create our container and add some people
person_container persons;
persons.push_back(person("B", 123));
persons.push_back(person("C", 224));
persons.push_back(person("A", 321));
// Typedefs for the sequence index and the ordered index
enum { Seq, Ord };
typedef person_container::nth_index<Seq>::type persons_seq_index;
typedef person_container::nth_index<Ord>::type persons_ord_index;
// Let's test the sequence index
persons_seq_index & seq_index = persons.get<Seq>();
for(persons_seq_index::iterator it = seq_index.begin(),
e = seq_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// And now the ordered index
persons_ord_index & ord_index = persons.get<Ord>();
for(persons_ord_index::iterator it = ord_index.begin(),
e = ord_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// Thanks to the ordered index we have fast lookup by name:
std::cout << "The id of B is: " << ord_index.find("B")->id << "\n";
}
Which produces the following output:
B:123
C:224
A:321
A:321
B:123
C:224
The id of B is: 123
Map is definitely not right for you:
"Internally, the elements in the map are sorted from lower to higher key value following a specific strict weak ordering criterion set on construction."
Quote taken from here.
Unfortunately there is no associative container in the STL that preserves insertion order, so either you use a nonassociative one like vector, or write your own :-(
I had the same problem every once in a while and here is my solution: https://github.com/nlohmann/fifo_map. It's a header-only C++11 solution and can be used as drop-in replacement for a std::map.
For your example, it can be used as follows:
#include "fifo_map.hpp"
#include <string>
#include <iostream>
using nlohmann::fifo_map;
int main()
{
fifo_map<std::string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(fifo_map<std::string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout<< (*i).first << ":"<<(*i).second << std::endl;
}
}
The output is then
B:123
A:321
Besides Neil's recommendation of a combined vector+map if you need both to keep the insertion order and the ability to search by key, you can also consider using boost multi index libraries, that provide for containers addressable in more than one way.
maps and sets are meant to impose a strict weak ordering upon the data. Strict weak ordering maintains that no entries are equivalent (which is different from being equal).
You need to provide a functor that the map/set may use to perform a<b. With this functor the map/set sorts its items (the STL from GCC uses a red-black tree). It determines whether two items are equivalent if !(a<b) && !(b<a) -- the equivalence test.
The functor looks like follows:
template <class T>
struct less : binary_function<T,T,bool> {
bool operator() (const T& a, const T& b) const {
return a < b;
}
};
If you can provide a function that tells the STL how to order things then the map and set can do what you want. For example
template<typename T>
struct ItemHolder
{
int insertCount;
T item;
};
You can then easily write a functor to order by insertCount. If your implementation uses red-black trees your underlying data will remain balanced -- however you will get a lot of re-balancing since your data will be generated based on incremental ordering (vs. Random) -- and in this case a list with push_back would be better. However you cannot access data by key as fast as you would with a map/set.
If you want to sort by string, provide the functor to search by string; using the insertCount you could potentially work backwards. If you want to search by both you can have two maps.
map<insertcount, string> x; // auxiliary key
map<string, item> y; //primary key
I use this strategy often -- however I have never placed it in code that is run often. I'm considering boost::bimap.
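A minimal sketch of the two-map strategy, assuming string keys and int items, could look like this. The class TwoMapStore and its member names are invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class TwoMapStore {
public:
    // Returns false (and changes nothing) if the key already exists.
    bool insert(const std::string& key, int item) {
        if (!by_key_.emplace(key, item).second) return false;
        by_order_.emplace(next_count_++, key);
        return true;
    }

    // O(log n) lookup through the primary map; nullptr when absent.
    const int* find(const std::string& key) const {
        auto pos = by_key_.find(key);
        return pos == by_key_.end() ? nullptr : &pos->second;
    }

    // Walks the auxiliary map, whose keys increase with insertion time.
    std::vector<std::string> keys_in_insertion_order() const {
        std::vector<std::string> out;
        for (const auto& entry : by_order_) out.push_back(entry.second);
        return out;
    }

private:
    unsigned long next_count_ = 0;
    std::map<unsigned long, std::string> by_order_;  // auxiliary key
    std::map<std::string, int> by_key_;              // primary key
};
```

Since the counter only grows, new entries always land at the right edge of the auxiliary tree, which is the rebalancing pattern described above.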
Well, there is no STL container which actually does what you wish, but there are possibilities.
1. STL
By default, use a vector. Here it would mean:
struct Entry { std::string name; int it; };
typedef std::vector<Entry> container_type;
If you wish to search by string, you always have the find_if algorithm at your disposal.
class ByName
{
public:
ByName(const std::string& name): m_name(name) {}
bool operator()(const Entry& entry) const { return entry.name == m_name; }
private:
std::string m_name;
};
// Use like this (find_if takes a predicate; plain find would compare Entry objects):
container_type myContainer;
container_type::iterator it =
std::find_if(myContainer.begin(), myContainer.end(), ByName("A"));
2. Boost.MultiIndex
This seems way overkill, but you can always check it out here.
It allows you to create ONE storage container, accessible via various indexes of various styles, all maintained for you (almost) magically.
Rather than using one container (std::map) to reference a storage container (std::vector) with all the synchro issues it causes... you're better off using Boost.
For preserving all the time complexity constraints you need map + list:
struct Entry
{
string key;
int val;
};
typedef list<Entry> MyList;
typedef MyList::iterator Iter;
typedef map<string, Iter> MyMap;
MyList l;
MyMap m;
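int find(string key)
{
Iter it = m.at(key); // O(log n); throws std::out_of_range if the key is missing
return it->val;
}
void put(string key, int val)
{
MyMap::iterator pos = m.find(key); // O(log n)
if (pos != m.end())
{
pos->second->val = val; // existing key: update in place, keep its position
return;
}
Entry e;
e.key = key;
e.val = val;
Iter it = l.insert(l.end(), e); // O(1)
m[key] = it; // O(log n)
}
void erase(string key)
{
MyMap::iterator pos = m.find(key); // O(log n)
if (pos == m.end()) return; // nothing to erase
l.erase(pos->second); // O(1)
m.erase(pos); // amortized O(1)
}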
void printAll()
{
for (Iter it = l.begin(); it != l.end(); it++)
{
cout<< it->key << ":"<< it->val << endl;
}
}
Enjoy
You could use a vector of pairs, it is almost the same as unsorted map container
std::vector<std::pair<T, U> > unsorted_map;
Use a vector. It gives you complete control over ordering.
I also think Map is not the way to go. The keys in a Map form a Set; a single key can occur only once. During an insert in the map the map must search for the key, to ensure it does not exist or to update the value of that key. For this it is important (performance wise) that the keys, and thus the entries, have some kind of ordering. As such a Map with insert ordering would be highly inefficient on inserts and retrieving entries.
Another problem would be if you use the same key twice; should the first or the last entry be preserved, and should it update the insert order or not?
Therefore I suggest you go with Neil's suggestion, a vector for insert-time ordering and a Map for key-based searching.
Yes, the map container is not for you.
As you asked, you need the following code instead:
struct myClass {
std::string stringValue;
int intValue;
myClass( const std::string& sVal, const int& iVal ):
stringValue( sVal ),
intValue( iVal) {}
};
std::vector<myClass> persons;
persons.push_back( myClass( "B", 123 ));
persons.push_back( myClass( "A", 321 ));
for(std::vector<myClass>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout << (*i).stringValue << ":" << (*i).intValue << std::endl;
}
Here the output is unsorted as expected.
Map is an ordered collection (the third template parameter is the ordering functor), as is set. If you want to pop elements in the same sequence as they were pushed, you should use deque, list, or vector.
In order to do what they do and be efficient at it, maps keep their elements sorted by key (typically in a balanced search tree). Therefore, you would use a map if you're willing to give up memory of insertion order to gain the convenience and performance of looking up by key.
If you need the insertion order stored, one way would be to create a new type that pairs the value you're storing with the order you're storing it (you would need to write code to keep track of the order). You would then use a map of string to this new type for storage. When you perform a look up using a key, you can also retrieve the insertion order and then sort your values based on insertion order.
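One way to read that suggestion is sketched below: the mapped type carries a sequence number, and a query sorts by it. The names Stamped and StampedMap are hypothetical, and string keys with int values are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value wrapper: the stored value plus its insertion sequence number.
struct Stamped {
    std::size_t order;
    int value;
};

class StampedMap {
public:
    // New keys get the next sequence number; existing keys keep theirs.
    void set(const std::string& key, int value) {
        auto res = data_.emplace(key, Stamped{next_order_, value});
        if (res.second) {
            ++next_order_;
        } else {
            res.first->second.value = value;
        }
    }

    // Copies the entries out and sorts them by sequence number: O(n log n).
    std::vector<std::pair<std::string, int>> in_insertion_order() const {
        std::vector<std::pair<std::size_t, std::string>> keyed;
        for (const auto& kv : data_) keyed.emplace_back(kv.second.order, kv.first);
        std::sort(keyed.begin(), keyed.end());

        std::vector<std::pair<std::string, int>> out;
        for (const auto& ok : keyed)
            out.emplace_back(ok.second, data_.at(ok.second).value);
        return out;
    }

private:
    std::size_t next_order_ = 0;
    std::map<std::string, Stamped> data_;
};
```

Lookup by key stays as cheap as in a plain map; the sorting cost is paid only when the insertion order is actually requested.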
One more thing: If you're using a map, be aware of the fact that testing if persons["C"] exists (after you've only inserted A and B) will actually insert a key value pair into your map.
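To test for a key without the side effect, use find (or count) instead of operator[]. The helper name contains below is made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Read-only membership test: find never inserts, unlike operator[].
inline bool contains(const std::map<std::string, int>& persons,
                     const std::string& key) {
    return persons.find(key) != persons.end();
}
```

Taking the map by const reference also makes the mistake impossible to compile, because operator[] is not available on a const std::map.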
Instead of a map you can use a vector of pairs!
ex:
vector<pair<unsigned,string>> myvec;
myvec.push_back(pair<unsigned,string>(1,"a"));
myvec.push_back(pair<unsigned,string>(5,"b"));
myvec.push_back(pair<unsigned,string>(3,"aa"));
Output:
myvec[0]=(1,"a"); myvec[1]=(5,"b"); myvec[2]=(3,"aa");
or another ex:
vector<pair<string,unsigned>> myvec2;
myvec2.push_back(pair<string,unsigned>("aa",1));
myvec2.push_back(pair<string,unsigned>("a",3));
myvec2.push_back(pair<string,unsigned>("ab",2));
Output: myvec2[0]=("aa",1); myvec2[1]=("a",3); myvec2[2]=("ab",2);
Hope this can help someone else in the future who was looking for non sorted maps like me!
struct Compare : public binary_function<int,int,bool> {
bool operator() (int a, int b) {return true;}
};
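Use this to get all the elements of a map in the reverse order in which you entered them (i.e. the first entered element will be the last and the last entered element will be the first). Not as good as the same order but it might serve your purpose with a little inconvenience. Be aware that a comparator that always returns true does not satisfy the strict weak ordering that std::map requires, so the behavior is undefined and lookups by key will not work reliably.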
Use a Map along with a vector of iterators that you fill as you insert into the Map. (Map iterators are not invalidated by insertions, only by erasing the element they point to.)
In the code below I am using Set
set<string> myset;
vector<set<string>::iterator> vec;
void printNonDuplicates(){
vector<set<string>::iterator>::iterator vecIter;
for(vecIter = vec.begin();vecIter!=vec.end();vecIter++){
cout<<(*vecIter)->c_str()<<endl;
}
}
void insertSet(string str){
pair<set<string>::iterator,bool> ret;
ret = myset.insert(str);
if(ret.second)
vec.push_back(ret.first);
}
If you don't want to use boost::multi_index, I have put a proof of concept class template up for review here:
https://codereview.stackexchange.com/questions/233157/wrapper-class-template-for-stdmap-stdlist-to-provide-a-sequencedmap-which
using std::map<KT,VT> and std::list<OT*> which uses pointers to maintain the order.
It will take O(n) linear time for the delete because it needs to search the whole list for the right pointer. To avoid that would need another cross reference in the map.
I'd vote for typedef std::vector< std::pair< std::string, int > > UnsortedMap;
Assignment looks a bit different, but your loop remains exactly as it is now.
There is std::unordered_map that you can check out. At first glance it looks like it might solve your problem, but note that its iteration order is unspecified rather than insertion order.