Loss of data while ordering an unordered_map in C++

I have an unordered_map<string, int> freq, and I order it by transforming it into a
map<int, string> freq2. I use the following function to do that:
map<int, string> order(unordered_map<string, int> x) {
map <int, string> map;
for (auto it = x.begin(); it != x.end(); ++it) {
map.emplace(it->second, it->first);
}
return map;
}
The size of the unordered_map is 2355831 and the size of the returned map is 505, so as you can see the loss of data is quite big, and I have no idea why.
Any idea why this happens?
Thanks.
EDIT:
Thanks to all, you are all right: I have a lot of ints with the same value, and that's why I lose the data (really stupid on my part not to see it before).

Most likely this is because there are duplicates among the int values. Try replacing map<int, string> with multimap<int, string>.

The code itself looks fine. However, since you are mapping from string keys to integers, it may well be that you have multiple keys with the same value.
From the documentation of emplace:
The insertion only takes place if no other element in the container has a key equivalent to the one being emplaced (keys in a map container are unique).
So if a lot of your entries in the first map have the same value (which is the key in the second map), then your dataset will decrease by a lot.
If you need to preserve those elements, then std::map is not the right container.
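As a minimal sketch of the multimap suggestion above: std::multimap allows several elements with the same key, so every entry survives the inversion. The function name order_keep_duplicates is hypothetical and not from the original post.

```cpp
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical helper (name not from the original post).
// Inverts word -> count into count -> word. A multimap keeps every entry,
// even when several words share the same count.
std::multimap<int, std::string>
order_keep_duplicates(const std::unordered_map<std::string, int>& freq) {
    std::multimap<int, std::string> by_count;
    for (const auto& entry : freq) {
        by_count.emplace(entry.second, entry.first);
    }
    return by_count;
}
```

The returned container has the same size as the input and is ordered by count. Taking the argument by const reference also avoids copying a map with millions of entries.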

Related

Push_back into map<int,vector<char>>*

map<int, vector<char>>* maxcounts;
When I have a pointer to the map maxcounts, how do I write the following statement correctly?
maxcounts[-m.second]->push_back(m.first);
Without the pointer I write
maxcounts[-m.second].push_back(m.first);
map<int, vector<char>> maxcounts;
for (pair<char, int> m : counts) {
if (maxcounts.count(-m.second))
maxcounts[-m.second].push_back(m.first);
else
maxcounts.insert({ -m.second, {m.first} });
}
To figure out how to use a pointer to the map, first rewrite your loop this way:
std::map<char, int> counts;
//...
std::map<int, std::vector<char>> maxcounts;
for (std::pair<char, int> m : counts)
maxcounts.insert({-m.second, std::vector<char>()}).first->second.push_back(m.first);
Note that the return value for std::map::insert is a std::pair, where the first of the pair is an iterator to the existing item if the item already is in the map, or the iterator to the newly inserted item. Thus you can perform the test and insert in one line without need for an if statement.
The push_back will occur, regardless of whether the item inserted in the map is new or if the item existed. Note that for a new entry, the std::vector being inserted starts as empty.
Given this, the pointer to the map version is very simple:
std::map<char, int> counts;
//...
map<int, vector<char>>* maxcounts;
//
for (pair<char, int> m : counts)
maxcounts->insert({-m.second, std::vector<char>()}).first->second.push_back(m.first);
Now, why you need a pointer to a map in the first place is another issue, but to answer your question, the above should work.
I would likely write something like:
std::map<int, std::vector<char>>* maxcounts = ...;
for (std::pair<char, int> m : counts)
(*maxcounts)[-m.second].push_back(m.first);
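To make the dereference-then-index form concrete, here is a small sketch wrapped in a function. The name group_by_negated_count is hypothetical. The original attempt, maxcounts[-m.second]->push_back(...), does not do what was intended because indexing a pointer performs pointer arithmetic instead of a map lookup.

```cpp
#include <map>
#include <vector>

// Hypothetical helper (name not from the original post).
// Groups characters by their negated count, writing through a pointer to the map.
void group_by_negated_count(const std::map<char, int>& counts,
                            std::map<int, std::vector<char>>* maxcounts) {
    for (const auto& m : counts) {
        // Dereference the pointer first, then index the map.
        // operator[] creates an empty vector the first time a key is seen.
        (*maxcounts)[-m.second].push_back(m.first);
    }
}
```

A caller that owns a plain map can pass its address, for example group_by_negated_count(counts, &maxcounts). Because the keys are negated, iterating the result visits the highest counts first.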

how to traverse in a unordered_map of unordered_map of unordered_map in c++

I wanted to traverse inside a data structure - unordered_map<int, unordered_map<int, unordered_map<int, int>>> myMap. To further specify I want to get the data elements like ->
myMap[someVal1][someVal2]
{all second elements of this unordered map}
I am aware that the same could be done with a 3D array; however, a 3D array would not be efficient, as the data range is huge and the program would end up using far more space than required. I tried using iterators such as unordered_map<int, unordered_map<int, unordered_map<int, int>>>::iterator i, and several other such iterators, but it always ends in some error or other. Could someone help me understand how this map can be traversed? Thanks in advance!
If you don't want to use iterators, you could traverse the map with a range-based for loop (it needs C++11, which I assume won't be a problem).
// myMap is the unordered_map<int, unordered_map<int, unordered_map<int, int>>> from the question
for (auto& level1 : myMap) {
    for (auto& level2 : level1.second) {
        for (auto& key_value : level2.second) {
            int key = key_value.first;
            int value = key_value.second;
            // ....
        }
    }
}
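Also, if you don't want to iterate over the whole map, but only over the values of the third level for two given keys, then this should do it (note that at() throws std::out_of_range if a key is missing):
int k1, k2; // set these to the first- and second-level keys before the loop
for (auto& key_value : myMap.at(k1).at(k2)) {
    //...
}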
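Since at() throws std::out_of_range when a key is absent, and operator[] would silently insert empty maps, a lookup based on find() may be preferable when the keys are not known to exist. The following is only a sketch: the alias Nested and the function third_level_values are hypothetical names.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical alias for the three-level map from the question.
using Nested =
    std::unordered_map<int, std::unordered_map<int, std::unordered_map<int, int>>>;

// Hypothetical helper: collects all innermost values stored under myMap[k1][k2].
// Returns an empty vector if either key is missing, and never modifies the map.
std::vector<int> third_level_values(const Nested& myMap, int k1, int k2) {
    std::vector<int> values;
    auto outer = myMap.find(k1);
    if (outer == myMap.end()) return values;
    auto middle = outer->second.find(k2);
    if (middle == outer->second.end()) return values;
    for (const auto& key_value : middle->second) {
        values.push_back(key_value.second);
    }
    return values;
}
```

The order of the returned values is unspecified, because unordered_map does not guarantee any iteration order.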

sorting std::map by value

Right now I have a map, and I need to sort it by value (int), and then by key (string) if there is a tie. I know I would need to write a custom comparison function for this, but so far I haven't been able to make it work.
(I need to store my data in a map because the strings are words and the ints are their frequencies, and I will later need to find the pairs by searching for the keys.)
The std::map can only be sorted by key (string in your case).
If you need to sort it by value as well, you'd need to create a std::multimap with the int as key and the string as value, and populate it by iterating over the map.
Alternatively, you could also create a vector<pair<int,string>> that you populate by iteration over the map and just use std::sort().
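The vector approach could look roughly like the sketch below, which sorts by value and breaks ties by key as the question asks. The function name sorted_by_count is hypothetical, and the pairs are kept as (string, int) rather than swapped.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (name not from the original post).
// Copies the map into a vector and sorts by count, then by word on ties.
std::vector<std::pair<std::string, int>>
sorted_by_count(const std::map<std::string, int>& freq) {
    std::vector<std::pair<std::string, int>> items(freq.begin(), freq.end());
    std::sort(items.begin(), items.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
                  if (a.second != b.second) return a.second < b.second;
                  return a.first < b.first;
              });
    return items;
}
```

The original map stays untouched, so lookups by word still work on it, while the vector provides the value-ordered view.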
You can use a std::multiset<std::pair<int, std::string>>; pairs compare by the int first and then by the string, which gives the required tie-break.
With the information given, it's a bit of a guessing game, but unless you are shuffling massive amounts of data, this may do.
using entry = std::pair<std::string, int>;
using CompareFunc = bool(*)(const entry&, const entry&);
using sortset = std::set<entry, CompareFunc>;
sortset bv(themap.begin(), themap.end(), [](const entry& a, const entry& b) { return a.second != b.second ? a.second < b.second : a.first < b.first; });
for(const auto& d : bv) {
//
}

Efficiently iterate multiple maps with the same keys

If I have two maps which are guaranteed to have exactly the same set of keys, how can I efficiently iterate through both maps?
For example, say I have the following maps:
std::map<std::string, int> iMap;
std::map<std::string, std::vector<int> > vMap;
At some point they both end up with exactly the same set of keys. I now need to update all values of vMap based on the corresponding iMap value. The first thing that comes to mind is something like this:
typedef map<string, int> map_t;
BOOST_FOREACH(map_t::value_type &p, iMap) {
vMap[p.first].push_back(p.second);
}
However, it seems rather wasteful that we have to look up each value of vMap[n], considering we're effectively going through the keys in order. Is there any way we can take advantage of this?
If you're absolutely sure that the keys are identical, you can iterate over both maps in lockstep:
auto it1 = iMap.begin();
auto it2 = vMap.begin();
while (it1 != iMap.end())
{
it2->second.push_back(it1->second);
++it1;
++it2;
}

predicate for a map from string to int

I have this small program that reads a line of input & prints the words in it, with their respective number of occurrences. I want to sort the elements in the map that stores these values according to their occurrences. I mean, the words that appear only once will be ordered at the beginning, then the words that appear twice, & so on. I know that the predicate should return a bool value, but I don't know what the parameters should be. Should it be two iterators to the map? If someone could explain this, it would be greatly appreciated. Thank you in advance.
#include<iostream>
#include<map>
#include<string>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::map;
int main()
{
string s;
map<string,int> counters; //store each word & an associated counter
//read the input, keeping track of each word & how often we see it
while(cin>>s)
{
++counters[s];
}
//write the words & associated counts
for(map<string,int>::const_iterator iter = counters.begin();iter != counters.end();iter++)
{
cout<<iter->first<<"\t"<<iter->second<<endl;
}
return 0;
}
std::map is always sorted according to its key. You cannot sort the elements by their value.
You need to copy the contents to another data structure (for example std::vector<std::pair<string, int> >) which can be sorted.
Here is a predicate that can be used to sort such a vector. Note that sorting algorithms in C++ standard library need a "less than" predicate which basically says "is a smaller than b".
bool cmp(std::pair<string, int> const &a, std::pair<string, int> const &b) {
return a.second < b.second;
}
You can't re-sort a map; its order is predefined (by default, from std::less on the key type). The easiest solution for your problem would be to create a std::multimap<int, string> and insert your values there, then just loop over the multimap, which will be ordered on the key type (int, the number of occurrences). That gives you the order you want without having to define a predicate.
You are not going to be able to do this with one pass with an std::map. It can only be sorted on one thing at a time, and you cannot change the key in-place. What I would recommend is to use the code you have now to maintain the counters map, then use std::max_element with a comparison function that compares the second field of each std::pair<string, int> in the map.
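A minimal sketch of the std::max_element suggestion follows; the function name most_frequent is hypothetical. Note that this finds only the single most frequent word, so it does not produce the full ordering by occurrences.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (name not from the original post).
// Returns an iterator to the entry with the largest count,
// or counters.end() if the map is empty.
std::map<std::string, int>::const_iterator
most_frequent(const std::map<std::string, int>& counters) {
    return std::max_element(
        counters.begin(), counters.end(),
        [](const std::pair<const std::string, int>& a,
           const std::pair<const std::string, int>& b) {
            return a.second < b.second;  // compare the second field only
        });
}
```

When several words share the highest count, std::max_element returns the first of them in iteration order, which for a std::map means the alphabetically smallest word.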
A map has its keys sorted, not its values. That's what makes the map efficient. You cannot sort it by occurrences without using another data structure (maybe a reversed index!)
As stated, it simply won't work -- a map always remains sorted by its key value, which would be the strings.
As others have noted, you can copy the data to some other structure, and sort by the value. Another possibility would be to use a Boost bimap instead. I've posted a demo of the basic idea previously.
You probably want to transform map<string,int> to vector<pair<string, int> > (not pair<const string, int>, which std::sort cannot reorder because such pairs are not assignable) and then sort the vector on the int member.
You could do
struct PairLessSecond
{
template< typename P >
bool operator()( const P& pairLeft, const P& pairRight ) const
{
return pairLeft.second < pairRight.second;
}
};
You can probably also construct all this somehow using a lambda with a bind.
Now copy the map into a vector and sort it:
std::vector< std::pair<std::string,int> > byCount( counters.begin(), counters.end() );
std::sort( byCount.begin(), byCount.end(), PairLessSecond() );