Random element in a map - c++

What is a good way to select a random element from a map in C++? As I understand it, map iterators are not random-access. The key is a long long and the map is sparsely populated.

map<...> MyMap;
map<...>::iterator item = MyMap.begin();
// random_0_to_n(n) must return an index in [0, n); the map must not be empty
std::advance( item, random_0_to_n(MyMap.size()) );

I like James' answer if the map is small or if you don't need a random value very often. If the map is large and you do this often enough for speed to matter, you could keep a separate vector of keys and select a random key from it.
map<...> MyMap;
vector<...> MyVecOfKeys; // <-- add keys to this when added to the map.
map<...>::key_type key = MyVecOfKeys[ random_0_to_n(MyVecOfKeys.size()) ];
map<...>::mapped_type value = MyMap[ key ];
Of course, if the map is really huge you might not be able to store a copy of all the keys like this. If you can afford it, though, picking the key takes constant time and the map lookup takes logarithmic time.

Maybe draw a random key, then use lower_bound to find the nearest key at or above it that the map actually contains.
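A minimal sketch of the lower_bound idea, assuming long long keys as in the question. The helper name pick_near_random_key is made up for illustration. Be aware that the result is not uniform over the elements: a key that follows a large gap is returned more often than a key that follows a small one, so this only fits cases where that bias is acceptable.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <random>

// Hypothetical helper: draws a key uniformly from [smallest key, largest key]
// and returns the first element whose key is not less than the draw.
// NOTE: this is NOT uniform over the elements (see the caveat above).
template <typename V, typename Rng>
typename std::map<long long, V>::const_iterator
pick_near_random_key(const std::map<long long, V>& m, Rng& rng) {
    assert(!m.empty());
    const long long lo = m.begin()->first;
    const long long hi = std::prev(m.end())->first;
    std::uniform_int_distribution<long long> dist(lo, hi);
    // The draw never exceeds the largest key, so lower_bound cannot return end().
    return m.lower_bound(dist(rng));
}
```

Each call costs one O(log n) lookup and needs no extra storage, which is the main attraction compared with keeping a separate key vector.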

Continuing ryan_s theme of preconstructed maps and fast random lookup: instead of vector we can use a parallel map of iterators, which should speed up random lookup a bit.
map<K, V> const original;
...
// construct index-keyed lookup map
map<unsigned, map<K, V>::const_iterator> fast_random_lookup;
map<K, V>::const_iterator it = original.begin(), itEnd = original.end();
for (unsigned i = 0; it != itEnd; ++it, ++i) {
fast_random_lookup[i] = it;
}
// lookup random value
// the stored iterator refers to a key/value pair, so take ->second
V v = fast_random_lookup[random_0_to_n(original.size())]->second;

If your map is static, then instead of a map, use a vector to store your key/value pairs in key order, binary search to look up values in log(n) time, and the vector index to get random pairs in constant time. You can wrap the vector/binary search to look like a map with a random access feature.
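A rough sketch of such a wrapper, assuming the data is built once and then only read, and that the keys are unique. The class name StaticMap and its member names are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical wrapper for a static (build once, then read-only) map.
template <typename K, typename V>
class StaticMap {
public:
    explicit StaticMap(std::vector<std::pair<K, V>> items)
        : items_(std::move(items)) {
        // Sort by key so that binary search works.
        std::sort(items_.begin(), items_.end(),
                  [](const std::pair<K, V>& a, const std::pair<K, V>& b) {
                      return a.first < b.first;
                  });
    }

    // O(log n) lookup; returns nullptr if the key is absent.
    const V* find(const K& key) const {
        auto it = std::lower_bound(
            items_.begin(), items_.end(), key,
            [](const std::pair<K, V>& item, const K& k) { return item.first < k; });
        return (it != items_.end() && !(key < it->first)) ? &it->second : nullptr;
    }

    // O(1) uniform random pick; the map must not be empty.
    template <typename Rng>
    const std::pair<K, V>& random_item(Rng& rng) const {
        assert(!items_.empty());
        std::uniform_int_distribution<std::size_t> dist(0, items_.size() - 1);
        return items_[dist(rng)];
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::pair<K, V>> items_;
};
```

Usage would look like constructing StaticMap<long long, int> from a vector of pairs, then calling find(key) for lookups and random_item(rng) with any standard engine such as std::mt19937.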

Maybe you should consider Boost.MultiIndex, although note that it is somewhat heavyweight for this purpose.

Here is the case when all map items must be accessed in random order:
1. Copy the map to a vector.
2. Shuffle the vector.
In pseudo-code (it closely reflects the following C++ implementation):
import random
import time
# populate map with some test data
m = dict((i*i, i) for i in range(3))
# copy map to vector
v = list(m.items())
# seed PRNG
# NOTE: this part is present only to reflect C++
r = random.Random(time.time())
# shuffle vector
r.shuffle(v)
# print randomized map elements
for e in v:
    print("%s:%s" % e, end=" ")
print()
In C++:
#include <algorithm>
#include <iostream>
#include <map>
#include <vector>
#include <boost/date_time/posix_time/posix_time_types.hpp>
#include <boost/foreach.hpp>
#include <boost/random.hpp>
int main()
{
using namespace std;
using namespace boost;
using namespace boost::posix_time;
// populate map by some stuff for testing
typedef map<long long, int> Map;
Map m;
for (int i = 0; i < 3; ++i)
m[i * i] = i;
// copy map to vector
#ifndef OPERATE_ON_KEY
typedef vector<pair<Map::key_type, Map::mapped_type> > Vector;
Vector v(m.begin(), m.end());
#else
typedef vector<Map::key_type> Vector;
Vector v;
v.reserve(m.size());
BOOST_FOREACH( Map::value_type p, m )
v.push_back(p.first);
#endif // OPERATE_ON_KEY
// make PRNG
ptime now(microsec_clock::local_time());
ptime midnight(now.date());
time_duration td = now - midnight;
mt19937 gen(td.ticks()); // seed the generator with raw number of ticks
random_number_generator<mt19937,
Vector::iterator::difference_type> rng(gen);
// shuffle vector
// rng(n) must return a uniformly distributed integer in the range [0, n)
random_shuffle(v.begin(), v.end(), rng);
// print randomized map elements
BOOST_FOREACH( Vector::value_type e, v )
#ifndef OPERATE_ON_KEY
cout << e.first << ":" << e.second << " ";
#else
cout << e << " ";
#endif // OPERATE_ON_KEY
cout << endl;
}

Has anyone tried this?
https://github.com/mabdelazim/Random-Access-Map
"C++ template class for random access map. This is like the std::map but you can access items random by index with syntax my_map.key(i) and my_map.data(i)"

std::random_device dev;
std::mt19937_64 rng(dev());
// elements must not be empty, otherwise size() - 1 wraps around
std::uniform_int_distribution<size_t> idDist(0, elements.size() - 1);
auto elementId = elements.begin();
std::advance(elementId, idDist(rng));
Now elementId points to a random element :)

Related

Vector of set insert elements

I'm trying to write a function which returns a vector of sets of strings, where each set holds the members of one team.
A group of names should be divided into teams for a game. Teams should be the same size, but this is not always possible unless n is exactly divisible by k. Therefore, it was decided that the first mod(n, k) teams have n / k + 1 members, and the remaining teams have n / k members.
#include <iostream>
#include <vector>
#include <string>
#include <set>
#include <list>
typedef std::vector<std::set<std::string>>vek;
vek Distribution(std::vector<std::string>names, int k) {
int n = names.size();
vek teams(k);
int number_of_first = n % k;
int number_of_members_first = n / k + 1;
int number_of_members_remaining = n / k;
int l = 0;
int j = 0;
for (int i = 1; i <= k; i++) {
if (i <= number_of_first) {
int number_of_members_in_team = 0;
while (number_of_members_in_team < number_of_members_first) {
teams[l].insert(names[j]);
number_of_members_in_team++;
j++;
}
}
else {
int number_of_members_in_team = 0;
while (number_of_members_in_team < number_of_members_remaining) {
teams[l].insert(names[j]);
number_of_members_in_team++;
j++;
}
}
l++;
}
return teams;
}
int main ()
{
for (auto i : Distribution({"Damir", "Ana", "Muhamed", "Marko", "Ivan",
"Mirsad", "Nikolina", "Alen", "Jasmina", "Merima"
}, 3)) {
for (auto j : i)
std::cout << j << " ";
std::cout << std::endl;
}
return 0;
}
OUTPUT should be:
Damir Ana Muhamed Marko
Ivan Mirsad Nikolina
Alen Jasmina Merima
MY OUTPUT:
Ana Damir Marko Muhamed
Ivan Mirsad Nikolina
Alen Jasmina Merima
Could you explain to me why the names are not printed in the expected order?
teams being a std::vector<...> supports random access via an index.
auto & team_i = teams[i]; (0 <= i < teams.size()), will give you an element of the vector. team_i is a reference to type std::set<std::list<std::string>>.
As a std::set<...> does not support random access via an index, you will need to access the elements via iterators (begin(), end() etc.), e.g.: auto set_it = team_i.begin();. *set_it will be of type std::list<std::string>.
Since std::list<...> also does not support random access via an index, again you will need to access it via iterators, e.g.: auto list_it = set_it->begin();. *list_it will be of type std::string.
This way it is possible to access every set in the vector, every list in each set, and every string in each list (after you have added them to the data structure).
However, using iterators with std::set and std::list is not as convenient as using indexed random access with std::vector. std::vector has additional benefits (simple and efficient implementation, contiguous memory block).
If you use std::vectors instead of std::set and std::list, vek will be defined as:
typedef std::vector<std::vector<std::vector<std::string>>> vek;
std::list, being a linked list, offers some benefits (like being able to add an element anywhere in O(1)). std::set guarantees that each value is present only once.
But if you don't really need these features, you could make your code simpler (and often more efficient) by using only std::vectors as your containers.
Note: if every set will only ever contain 1 list (of strings), you can consider getting rid of 1 level of the hierarchy, i.e. store the lists (or vectors, as I suggested) directly as elements of the top-level vector.
UPDATE:
Since the question was changed, here's a short update:
In my answer above, ignore all the mentions of std::list. When you iterate over the std::set, the elements are already std::strings.
The reason the names are not in the order you expect:
std::set keeps its elements sorted, and when you iterate over it you get the elements in that sort order. See the answer here: Is the std::set iteration order always ascending according to the C++ specification?. Your set contains std::strings, and the default order for them is lexicographical (alphabetical for these names).
Using std::vector instead of std::set, as I proposed above, will get you the result you wanted (std::vector is not sorted automatically).
If you want to try using only std::vector:
Change vek to:
typedef std::vector<std::vector<std::string>>vek;
And replace the usage of insert (to add an element to the set) with push_back to do the same for a vector.
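Putting those two changes together, a vector-only version might look like the sketch below. It keeps the names Distribution and vek from the question. The loop is condensed and the parameter types are changed to a const reference and std::size_t, which is my own simplification rather than something the question requires.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::vector<std::string>> vek;

// Splits names into k teams, keeping the input order inside each team.
// The first n % k teams get n / k + 1 members, the rest get n / k.
vek Distribution(const std::vector<std::string>& names, std::size_t k) {
    assert(k > 0);
    const std::size_t n = names.size();
    vek teams(k);
    std::size_t next = 0;  // index of the next name to hand out
    for (std::size_t t = 0; t < k; ++t) {
        const std::size_t team_size = n / k + (t < n % k ? 1 : 0);
        for (std::size_t m = 0; m < team_size; ++m)
            teams[t].push_back(names[next++]);
    }
    return teams;
}
```

With the ten names from the question and k = 3, the first team comes out as Damir Ana Muhamed Marko, in insertion order rather than sorted order.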

Using iterators on maps

map<double, LatLon> closestPOI;
map<double, LatLon> ::iterator iterPOI = closestPOI.begin();
I made a tree that is keyed by distance between two points. I need to find the 3 points in this tree that are the smallest (3 smallest distances). I declared an iterator and initialized it to point at the root (I'm not sure if that was necessary but it didn't solve my problem). I tried using advance(iterPOI, 1) to increment the iterator but that didn't work either. How can I find these 3 points and access their values?
Note: Yes I know that the 3 nodes I want are the root and its kids (since they have the smallest distances)
Usually you use a for() loop to iterate the map:
for(map<double, LatLon> ::iterator iterPOI = closestPOI.begin();
iterPOI != closestPOI.end();
++iterPOI) {
// access the iterator's key: iterPOI->first ...
// access the iterator's value: iterPOI->second ...
}
To iterate over a map you can do something like this (assuming a C++11 compiler, e.g. gcc 4.8.2 or later):
map<double, LatLon> closestPOI;
// without C++11 auto, spell out the iterator type as you did in your question
for(auto i_poi = closestPOI.begin(); i_poi != closestPOI.end(); ++i_poi)
{
    // to get at the double you do: i_poi->first
    // to get at the LatLon you do: i_poi->second
}
Hope that helps a bit
Here is an example of getting the first three elements (i.e. those with the smallest keys) of a map. I've aliased LatLon to int just as an example, so the program is self-contained:
#include <iostream>
#include <map>
#include <vector>
using LatLon = int;
int main()
{
std::map<double, LatLon> map { { 1.0, 1 }, { 2.0, 2 }, { 3.0, 3 }, { 0.5, 4 } };
// Get the three closest points and store them in a vector
std::vector<LatLon> closest;
for ( const auto& pair : map ) {
if ( closest.size() >= 3 )
break;
closest.push_back(pair.second);
}
// Do something with the three closest points
for ( auto latlon : closest )
std::cout << latlon << '\n';
return 0;
}
Note that if there are fewer than 3 points in your map to begin with, your closest vector will also have fewer than 3 elements.

efficient method to select index of vector in c++

In C++, suppose you have a vector with boolean values, and you want to select randomly one index among those corresponding to True values.
What is the most efficient method to use?
Example:
vector<bool> v(4);
v.at(0)=true;
v.at(1)=false;
v.at(2)=true;
v.at(3)=true;
You want to select a number among the subset {0,2,3}.
I have so far tried 2 methods:
1. Stacking the indexes of the true values in a vector and then selecting among these elements. Extremely slow.
2. Naive method: randomly select an index until v.at(rnd_sel_index) is true. Considerably faster.
Any suggestions faster than method 2?
Perhaps there's a more efficient approach.
Rather than storing a flag for every index, perhaps it's better to store only the indices that are free, i.e. a vector containing just those indices.
The order of this vector can easily be randomised once, and you can then pull items from the back() until it's empty().
When you want to return items to the 'free index pool', simply insert them at a random position in the vector.
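A possible shape for such a pool is sketched below. The class name IndexPool and its members are invented for this example. One deliberate deviation from the description: instead of inserting at a random position (which shifts elements and costs O(n)), the returned index is appended and then swapped with a randomly chosen slot, which places it at a random position in O(1).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical pool of the indices that are currently selectable.
class IndexPool {
public:
    // Collects the indices whose flag is true and shuffles them once.
    explicit IndexPool(const std::vector<bool>& flags,
                       unsigned seed = std::random_device{}())
        : rng_(seed) {
        for (std::size_t i = 0; i < flags.size(); ++i)
            if (flags[i]) free_.push_back(i);
        std::shuffle(free_.begin(), free_.end(), rng_);
    }

    bool empty() const { return free_.empty(); }

    // Removes and returns a random index in O(1); the pool must not be empty.
    std::size_t take() {
        assert(!free_.empty());
        const std::size_t index = free_.back();
        free_.pop_back();
        return index;
    }

    // Returns an index to the pool at a random position in O(1):
    // append it, then swap it with a randomly chosen slot.
    void give_back(std::size_t index) {
        free_.push_back(index);
        std::uniform_int_distribution<std::size_t> dist(0, free_.size() - 1);
        std::swap(free_[dist(rng_)], free_.back());
    }

private:
    std::vector<std::size_t> free_;
    std::mt19937 rng_;
};
```

This suits workloads where each selected index is consumed; if the same index may be drawn repeatedly without being removed, picking free_[random index] directly would be the simpler variant.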
You can use the well-known method for selecting an element from a sequence of unknown length (reservoir sampling with a reservoir of size one).
Example Code:
#include <random>
#include <iostream>
#include <vector>
#include <algorithm>
std::size_t choose_element(const std::vector<bool>& v) {
    auto last = v.end();
    auto chosen_i = std::find(v.begin(), last, true);
    if (chosen_i == last)
        return v.size();  // no true element; std::next(last) below would be undefined
    auto i = std::find(std::next(chosen_i), last, true);
    double n = 2.0;
    static auto random_generator = std::mt19937{std::random_device{}()};
    while (i != last) {
        // replace the current choice with the n-th true element with probability 1/n
        if (std::bernoulli_distribution(1.0 / n)(random_generator))
            chosen_i = i;
        i = std::find(std::next(i), last, true);
        ++n;
    }
    return std::distance(v.begin(), chosen_i);
}
int main() {
std::vector<bool> v = {true, true, false, true};
std::vector<int> indexes(v.size());
const double N = 100;
for (int i=0; i<N; ++i)
++indexes[choose_element(v)];
for (auto& index : indexes)
std::cout << std::distance(indexes.data(), &index) << ": " << (index / N) << "\n";
return 0;
}
This has predictable performance and only takes one pass through the data. Of course if you are taking multiple samples from the same vector it may be more efficient to restructure the data to a different format and then draw from that. Also, if nearly all of the elements are true, your method (2) might perform better in the average case.

Select random element in an unordered_map

I define an unordered_map like this:
std::unordered_map<std::string, Edge> edges;
Is there an efficient way to choose a random Edge from the unordered_map edges?
Pre-C++11 solution:
std::tr1::unordered_map<std::string, Edge> edges;
std::tr1::unordered_map<std::string, Edge>::iterator random_it = edges.begin();
std::advance(random_it, rand_between(0, edges.size()));
C++11 onward solution:
std::unordered_map<std::string, Edge> edges;
auto random_it = std::next(std::begin(edges), rand_between(0, edges.size()));
Which function you use to pick the random number is up to you, but make sure it returns a number in the range [0, edges.size() - 1] when edges is not empty. Advancing by edges.size() yields end(), which must not be dereferenced.
The std::next function simply wraps std::advance and returns the advanced iterator, so the result can be used directly in an initialization.
Is there an efficient way to choose a random Edge from the unordered_map edges?
If by efficient you mean O(1), then no, it is not possible.
Since the iterators returned by unordered_map::begin / end are ForwardIterators, the approaches that simply use std::advance are O(n) in the number of elements.
If your specific use allows it, you can trade some randomness for efficiency:
You can select a random bucket (that can be accessed in O(1)), and then a random element inside that bucket.
// edges must not be empty, otherwise this loop never terminates
std::size_t bucket, bucket_size;
do
{
    bucket = rnd(edges.bucket_count());
}
while ( (bucket_size = edges.bucket_size(bucket)) == 0 );
auto element = std::next(edges.begin(bucket), rnd(bucket_size));
Where rnd(n) returns a random number in the [0,n) range.
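In practice, if you have a decent hash, most of the non-empty buckets will contain exactly one element. Where they do not, this approach slightly favours elements in sparsely populated buckets, because every non-empty bucket is equally likely to be chosen regardless of how many elements it holds.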
Strict O(1) solution without buckets:
Keep a vector of keys. When you need a random element from your map, select a random key from the vector and return the corresponding value from the map. This takes constant time on average.
When you insert a key-value pair into your map, check whether the key is already present, and if it is not, add it to your key vector. This takes constant time on average.
If you want to remove an element from the map after it was selected, swap the selected key with the back() element of your key vector and call pop_back(), then erase the element from the map and return the value. This takes constant time on average.
However, there is a limitation: if you want to delete elements from the map other than by random picking, you need to fix up your key vector, which takes O(n) with the naive approach. There is still a way to get O(1) performance: keep a map that tells you where each key is in the key vector, and update it whenever you swap :)
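The bookkeeping described above can be sketched as follows. The class name RandomPickMap and its members are made up for illustration. Instead of a second map from key to position, this sketch stores the position next to the value inside the same unordered_map, which is one possible way to implement the "map that tells you where the key is".

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container: an unordered_map plus a key vector, with the
// position of every key in that vector stored next to its value.
template <typename K, typename V>
class RandomPickMap {
public:
    // Inserts key/value if the key is new; returns false otherwise.
    bool insert(const K& key, const V& value) {
        auto result = map_.emplace(key, Entry{value, keys_.size()});
        if (!result.second) return false;
        keys_.push_back(key);
        return true;
    }

    // Erases any key in O(1) on average by moving the last key into its slot.
    bool erase(const K& key) {
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        const std::size_t pos = it->second.pos;
        map_.erase(it);
        if (pos != keys_.size() - 1) {
            keys_[pos] = std::move(keys_.back());
            map_.find(keys_[pos])->second.pos = pos;  // record the moved key's new slot
        }
        keys_.pop_back();
        return true;
    }

    // Returns a uniformly random key; the container must not be empty.
    template <typename Rng>
    const K& random_key(Rng& rng) const {
        assert(!keys_.empty());
        std::uniform_int_distribution<std::size_t> dist(0, keys_.size() - 1);
        return keys_[dist(rng)];
    }

    const V& at(const K& key) const { return map_.at(key).value; }
    std::size_t size() const { return keys_.size(); }

private:
    struct Entry {
        V value;
        std::size_t pos;  // index of the key inside keys_
    };
    std::unordered_map<K, Entry> map_;
    std::vector<K> keys_;
};
```

For the question's case this would be RandomPickMap<std::string, Edge>, with at(random_key(rng)) returning a random Edge. The cost is one extra copy of every key.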
This is how you can get random element from a map:
std::unordered_map<std::string, Edge> edges;
auto item = edges.begin();
int random_index = rand() % edges.size();
std::advance(item, random_index);
Or take a look at this answer, which provides the following solution:
std::unordered_map<std::string, Edge> edges;
auto item = edges.begin();
std::advance( item, random_0_to_n(edges.size()) );
The solution
std::unordered_map<std::string, Edge> edges;
auto random_it = std::next(std::begin(edges), rand_between(0, edges.size()));
is extremely slow, because std::next has to advance a forward iterator one element at a time.
A much faster solution is:
1. When inserting into edges, simultaneously emplace the key into a std::vector<std::string> vec.
2. Draw a random int index in the range 0 to vec.size() - 1.
3. Then get edges[vec[index]].
See also this problem:
LeetCode problem 380. Insert Delete GetRandom O(1)
You can keep a vector alongside the map and use its random-access indexing to get random values more efficiently, like this:
#include <cstdlib>
#include <unordered_map>
#include <utility>
#include <vector>
using namespace std;
class RandomizedSet {
public:
    unordered_map<int, int> m;  // value -> index of that value in data
    vector<int> data;
    RandomizedSet() {
    }
    bool insert(int val) {
        if(m.count(val)){
            return false;
        } else{
            int index = data.size();
            data.push_back(val);
            m[val] = index;
            return true;
        }
    }
    bool remove(int val) {
        if(m.count(val)){
            // move the last value into the slot of val, then drop the last slot
            int curr_index = m[val];
            int max_index = data.size()-1;
            m[data[max_index]] = curr_index;
            swap(data[curr_index], data[max_index]);
            data.pop_back();
            m.erase(val);
            return true;
        } else{
            return false;
        }
    }
    // the set must not be empty when this is called
    int getRandom() {
        return data[rand() % data.size()];
    }
};
/**
* Your RandomizedSet object will be instantiated and called as such:
* RandomizedSet* obj = new RandomizedSet();
* bool param_1 = obj->insert(val);
* bool param_2 = obj->remove(val);
* int param_3 = obj->getRandom();
*/

the value of iterator

I created a map.
I want to print the index of a key to a file, using an iterator into the map.
This is what I mean:
map <string,int> VendorList;
VendorList[abc] = 0;
VendorList[mazda] = 111;
VendorList[ford] = 222;
VendorList[zoo] = 444;
map <string,int>::iterator itr=VendorList.find("ford");
fstream textfile;
textfile << itr;
If I put abc in the find line, I want the program to print 1.
If I put mazda in the find line, I want the program to print 2.
If I put ford in the find line, I want the program to print 3.
If I put zoo in the find line, I want the program to print 4.
How do I do that?
The compiler complains about the line:
textfile << itr;
It gives this error:
error C2679: binary '<<' : no operator found which takes a right-hand operand of type 'std::_Tree<_Traits>::iterator' (or there is no acceptable conversion)
Your program has many bugs. Frankly speaking, I am not sure what your requirement is.
Anyway, try this:
map <string,int> VendorList;
VendorList["abc"] = 1;
VendorList["mazda"] = 2;
VendorList["ford"] = 3;
VendorList["zoo"] = 4;
map <string,int>::iterator itr=VendorList.find("ford");
cout<<(itr->second);// this will print 3
EDIT :
Also, somebody suggested using a vector of pairs, and I think they are right. Try something like this:
#include <iostream>
#include <string>
#include <utility>
#include <vector>
using namespace std;
int main()
{
    typedef vector<pair<string,int> > Vm;
    Vm V;
    V.push_back(make_pair("abc",0));
    V.push_back(make_pair("mazda",111));
    V.push_back(make_pair("ford",222));
    V.push_back(make_pair("zoo",444));
    for(size_t i=0;i!=V.size();++i)
        if(V[i].first=="ford")
            cout<<(i+1); // prints 3: "ford" was the third element inserted
}
Modify the above program as per requirement.
Hope that helps.
In a map, the elements aren't stored in the order of insertion, so you have to keep the "order" data yourself.
I would suggest considering a vector of pairs instead of a map. A vector does store the elements in the order of insertion, and its iterator is random-access, so you can compute the position with operator-.
vector <pair<string, int> >::iterator itr;
// itr = the needed element
cout << itr - VendorList.begin();
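If the position in sorted key order is enough (as opposed to insertion order), it can also be computed on the original map with std::distance. The helper name position_of below is made up for this sketch. Note that for the question's data the sorted order is abc, ford, mazda, zoo, so ford is at position 2 and mazda at 3, which differs from the insertion order the question asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Returns the 1-based position of key in the map's sorted key order,
// or 0 if the key is absent. Linear time: map iterators are bidirectional,
// so std::distance has to walk from begin() to the found element.
std::size_t position_of(const std::map<std::string, int>& m, const std::string& key) {
    auto it = m.find(key);
    if (it == m.end()) return 0;
    return static_cast<std::size_t>(std::distance(m.begin(), it)) + 1;
}
```

Writing the result to a file is then just textfile << position_of(VendorList, "ford"), since the position is a plain integer rather than an iterator.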
As such, the concept of 'index' doesn't really fit with Maps.
Maps are just key-value pairs where you store a value (say, '111') and access it using a key (say 'mazda'). In this way you don't really need an index in order to access '111', you can just use the key 'mazda'.
If you do want your application to be index based however, consider using a different data structure like a Vector or a Linked List.