C++ find keys and values in a multimap

I want to create a structure that holds distinct strings and assigns several int values (not a single one) to each of them. After filling that structure, I want to check, for each string, how many different ints have been assigned to it and which ones they are. I know this can be tackled with a multimap. However, I am not sure whether (or how) I can get all the distinct strings contained in the multimap, since find requires a key to match and I do not know in advance which strings the multimap contains. How could this be done with a multimap?
As an alternative, I tried a plain map with a vector as the value. I still cannot make this work, because the vector's iterator does not seem to be recognized; the compiler tells me: iterator must have a pointer to class type.
map<string, vector<int>>::iterator multit;
int candID1, candID2, candID3;
for(multit=Freq.begin(); multit!=Freq.end(); multit++)
{
if((*multit).second.size()==3)
{
vector<int> vectorWithIds = (*multit).second;
for(vector<int>::iterator it = vectorWithIds.begin();
it != vectorWithIds.end();it++)
{
candID1 = it-> Problem: The iterator is not recognized
}
}
}
Could anyone detect the problem? Is there an attainable solution, either on the first or the second way?

What is it->? It's a vector of ints, so you probably want *it.
P.S. I have to admit I haven't read the whole prose.
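A minimal sketch of the loop from the question with *it in place of it->. The function name firstTriple and the FreqMap typedef are made up for illustration; the sketch also takes the vector by reference instead of copying it, which the original code did not do.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > FreqMap;

// Returns the ids of the first key that has exactly three values,
// or an empty vector if no key has three values.
std::vector<int> firstTriple(const FreqMap& freq) {
    for (FreqMap::const_iterator multit = freq.begin(); multit != freq.end(); ++multit) {
        const std::vector<int>& ids = multit->second;  // reference, no copy
        if (ids.size() == 3) {
            std::vector<int> out;
            for (std::vector<int>::const_iterator it = ids.begin(); it != ids.end(); ++it) {
                out.push_back(*it);  // *it is the int itself; it-> would need a class type
            }
            return out;
        }
    }
    return std::vector<int>();
}
```

The arrow operator only makes sense when the element type has members, which is why multit->second works on the map iterator but it-> does not work on a vector<int> iterator.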

I suggest a multimap<string, int>, assuming I understood your requirements correctly: you have "unique" strings with several different values each. You can use count(key) to see how many values there are for a key, and equal_range(key), which returns a pair<iterator, iterator>: first points to the start of the range of values for the key, and second points one past the last value for that key.
See reference
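To answer the "how do I get the distinct strings without knowing them" part, here is a sketch that walks the multimap and jumps from one key to the next with upper_bound. The helper names distinctKeys and valuesFor and the Freq typedef are hypothetical. Note that a multimap happily stores the same (key, value) pair twice, so the values returned here are not deduplicated.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::multimap<std::string, int> Freq;

// Collects every distinct key by jumping from one key's range to the next.
std::vector<std::string> distinctKeys(const Freq& freq) {
    std::vector<std::string> keys;
    for (Freq::const_iterator it = freq.begin(); it != freq.end();
         it = freq.upper_bound(it->first)) {
        keys.push_back(it->first);
    }
    return keys;
}

// Collects all values stored under one key, using equal_range.
std::vector<int> valuesFor(const Freq& freq, const std::string& key) {
    std::vector<int> values;
    std::pair<Freq::const_iterator, Freq::const_iterator> range = freq.equal_range(key);
    for (Freq::const_iterator it = range.first; it != range.second; ++it) {
        values.push_back(it->second);
    }
    return values;
}
```

For each key returned by distinctKeys, freq.count(key) gives the number of values and valuesFor(freq, key) gives the values themselves.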

OK, this is not efficient at all, but you can use a std::set initialized from your std::vector to extract only the unique values of the std::vector, as in this example:
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>
int main() {
// some data
std::string keys[] = {"first", "second", "third"};
int values[] = {1, 2, 1, 3, 4, 2, 2, 4, 9};
// initial data structures
std::vector<std::string> words(keys, keys + sizeof(keys) / sizeof(std::string));
std::vector<int> numbers(values, values + sizeof(values) / sizeof(int));
// THE map
std::map< std::string, std::vector<int> > dict;
// inserting data into the map
std::vector<std::string>::iterator itr;
for(itr = words.begin(); itr != words.end(); itr++) {
dict.insert(std::pair< std::string, std::vector<int> > (*itr, numbers));
} // for
// SOLUTION
// count unique values for the key of std::map<std::string, std::vector<int> >
std::map<std::string, std::vector<int> >::iterator mtr;
for(mtr = dict.begin(); mtr != dict.end(); mtr++) {
std::set<int> unique((*mtr).second.begin(), (*mtr).second.end());
std::cout << unique.size() << std::endl;
} // for
return 0;
} // main

Related

How to efficiently delete a key from an std::unordered_set if the key is not present in an std::map?

I have an std::map and an std::unordered_set with the same key.
I want to remove all keys from the set that do not exist in the map.
My idea was to do something like the following:
#include <map>
#include <unordered_set>
int main()
{
std::map<uint64_t, std::string> myMap = {
{1, "foo"},
{2, "bar"},
{3, "morefoo"},
{4, "morebar"}
};
std::unordered_set<uint64_t> mySet = { 1, 2, 3, 4, 123 };
for (const auto key : mySet)
{
const auto mapIterator = myMap.find(key);
if (myMap.end() == mapIterator)
mySet.erase(key);
}
return 0;
}
And then invoke it on a timer callback every few seconds, however, the snippet above throws an assert when trying to delete the key 123 from mySet, stating:
List iterator is not incrementable.
The thing is, even if it didn't throw an exception I feel like this idea is far from elegant/optimal. I'm wondering if there is a better way to approach this problem?
Thanks.

As stated in the answers to the question "How to remove from a map while iterating it?", you cannot erase elements from a container while iterating over it in a range-based for loop, so your code should use an iterator. Since that answer has already been provided, I will not duplicate it.
For efficiency, your approach is the wrong way round: you are doing lookups in the std::map while iterating the std::unordered_set. It should be the opposite, as std::unordered_set provides faster lookup (otherwise there is no point in using it instead of std::set). One possible approach is to create a copy:
auto copySet = mySet;
for( const auto &p : myMap )
copySet.erase( p.first );
for( const auto v : copySet )
mySet.erase( v );
which could be more efficient, but that depends on your data. A better approach is to choose proper data types for your containers.
Note: by "wrong approach" I mean efficiency only for the particular situation presented in your question. This seems to be part of a larger program, and this data structure may well be right for it, as there could be more important cases where it works well.

As stated in the comments
for (const auto key : mySet)
{
const auto mapIterator = myMap.find(key);
if (myMap.end() == mapIterator)
mySet.erase(key);
}
will have undefined behavior. When you erase element 123 from mySet, you invalidate the iterator that the range-based for loop is using. Incrementing that iterator is not allowed after you do that. What you can do is switch to a regular for loop so you can control when the iterator is incremented, like
for (auto it = mySet.begin(); it != mySet.end();)
{
if (myMap.find(*it) == myMap.end())
it = mySet.erase(it);
else
++it;
}
and now you always have a valid iterator as erase will return the next valid iterator.
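Since the question mentions running this on a timer callback, the loop can be wrapped in a small function. This is only a sketch: the name eraseMissing and the returned count are my own additions, not something from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_set>

// Removes from `keys` every element that is not a key of `lookup`.
// Returns the number of removed elements.
std::size_t eraseMissing(std::unordered_set<std::uint64_t>& keys,
                         const std::map<std::uint64_t, std::string>& lookup) {
    std::size_t removed = 0;
    for (auto it = keys.begin(); it != keys.end();) {
        if (lookup.find(*it) == lookup.end()) {
            it = keys.erase(it);  // erase returns the iterator after the removed element
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

With the data from the question, eraseMissing(mySet, myMap) removes only 123 and leaves the other four keys in place.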

Vector as key value pair in Hash map

I'm trying to create a hash_map in C++ whose values are of type std::vector. What I'm not getting is how to insert multiple values into the vector part of the hash table.
hash_map<string, int> hm;
hm.insert(make_pair("one", 1));
hm.insert(make_pair("three", 2));
The above example is a simple way of using hash map without vector as a key-pair value.
The example below uses Vector. I am trying to add multiple int values for each corresponding string value, e.g. => "one" & (1,2,3) instead of "one" & (1).
hash_map<string, std::vector<int>> hm;
hm.insert(make_pair("one", ?)); // How do I insert values in both the vector as well as hash_map
hm.insert(make_pair("three", ?)); // How do I insert values in both the vector as well as hash_map
If you're wondering why use vectors here, basically I'm trying to add multiple values instead of a single int value foreach corresponding string value.

hash_map<string, std::vector<int>> hm;
hm.insert(make_pair("one", vector<int>{1,2,3}));
hm.insert(make_pair("three", vector<int>{4,5,6}));

You can do the following:
std::unordered_map<std::string, std::vector<int>> hm;
hm.emplace("one", std::vector<int>{ 1, 2, 3 });
If you want to add to it later you can perform:
hm["one"].push_back(4);
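If the values arrive one at a time, the same push_back idiom works even for keys that are not in the table yet. A small sketch, where the Table typedef and the addValue helper are names I made up:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

typedef std::unordered_map<std::string, std::vector<int> > Table;

// Appends one value under `key`, creating the vector on first use.
void addValue(Table& table, const std::string& key, int value) {
    // operator[] default-constructs an empty vector when the key is new
    table[key].push_back(value);
}
```

Keep in mind that emplace (like insert) does nothing when the key already exists, so it cannot be used to replace or extend an existing vector; operator[] followed by push_back or assignment can.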

Here is a version that compiles:
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
int main()
{
std::unordered_map<std::string, std::vector<int> > hm;
hm["one"]={1,2,3};
hm["two"]={5,6,7};
for (const auto&p : hm)
{
std::cout<< p.first << ": ";
for (const auto &i : p.second)
std::cout<< i << ", ";
std::cout<< std::endl;
}
}
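This outputs (the iteration order of an unordered_map is unspecified, so the lines may appear in a different order):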
two: 5, 6, 7,
one: 1, 2, 3,
The previous answers are basically right (I just haven't tested them). At their core, they use the vector constructor that takes an initializer list, which is the only way to create the vector directly by enumerating its values. Nevertheless, I wanted to show what I think is a better way to do what you actually want: set a new value for a given string key.
operator[] of this container returns a reference to the value for the given key, here a vector<int>. If the key is new, it first creates a new value (an empty vector) and inserts that pair. Then operator= of vector<int> assigns from the initializer list. I would say you should prefer this direct variant unless you find a serious reason not to, because it is more idiomatic and far more direct.

How to preserve insertion order in Map? [duplicate]

I currently have a std::map<std::string,int> that stores an integer value to a unique string identifier, and I do look up with the string. It does mostly what I want, except that it does not keep track of the insertion order. So when I iterate the map to print out the values, they are sorted according to the string; but I want them to be sorted according to the order of (first) insertion.
I thought about using a vector<pair<string,int>> instead, but I need to look up the string and increment the integer values about 10,000,000 times, so I don't know whether a std::vector will be significantly slower.
Is there a way to use std::map or is there another std container that better suits my need?
I'm on GCC 3.4, and I have probably no more than 50 pairs of values in my std::map.

If you have only 50 values in std::map you could copy them to std::vector before printing out and sort via std::sort using appropriate functor.
Or you could use boost::multi_index. It allows to use several indexes.
In your case it could look like the following:
struct value_t {
string s;
int i;
};
struct string_tag {};
typedef multi_index_container<
value_t,
indexed_by<
random_access<>, // this index represents insertion order
hashed_unique< tag<string_tag>, member<value_t, string, &value_t::s> >
>
> values_t;

You might combine a std::vector with a std::tr1::unordered_map (a hash table). Here's a link to Boost's documentation for unordered_map. You can use the vector to keep track of the insertion order and the hash table to do the frequent lookups. If you're doing hundreds of thousands of lookups, the difference between O(log n) lookup for std::map and O(1) for a hash table might be significant.
std::vector<std::string> insertOrder;
std::tr1::unordered_map<std::string, long> myTable;
// Initialize the hash table and record insert order.
myTable["foo"] = 0;
insertOrder.push_back("foo");
myTable["bar"] = 0;
insertOrder.push_back("bar");
myTable["baz"] = 0;
insertOrder.push_back("baz");
/* Increment things in myTable 100000 times */
// Print the final results.
for (std::size_t i = 0; i < insertOrder.size(); ++i)
{
const std::string &s = insertOrder[i];
std::cout << s << ' ' << myTable[s] << '\n';
}

Tessil has a very nice implementation of an ordered map (and set), which is under the MIT license. You can find it here: ordered-map
Map example
#include <iostream>
#include <string>
#include <cstdlib>
#include "ordered_map.h"
int main() {
tsl::ordered_map<char, int> map = {{'d', 1}, {'a', 2}, {'g', 3}};
map.insert({'b', 4});
map['h'] = 5;
map['e'] = 6;
map.erase('a');
// {d, 1} {g, 3} {b, 4} {h, 5} {e, 6}
for(const auto& key_value : map) {
std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}
map.unordered_erase('b');
// Break order: {d, 1} {g, 3} {e, 6} {h, 5}
for(const auto& key_value : map) {
std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}
}

Keep a parallel list<string> insertionOrder.
When it is time to print, iterate on the list and do lookups into the map.
for each element in insertionOrder // walks in insertion order...
print map[ element ] // ...but the lookup is in the map

If you need both lookup strategies, you will end up with two containers. You may use a vector with your actual values (ints), and put a map< string, vector< T >::difference_type> next to it, returning the index into the vector.
To complete all that, you may encapsulate both in one class.
But I believe boost has a container with multiple indices.
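A rough sketch of that encapsulation, assuming a C++11 compiler (on something as old as GCC 3.4 you would have to fall back to std::map or a TR1 hash table for the index). The class name OrderedCounter and its members are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical wrapper: one counter per string, iterated in first-insertion order.
class OrderedCounter {
public:
    // Adds `delta` to the counter for `key`, registering the key on first use.
    void increment(const std::string& key, int delta = 1) {
        std::unordered_map<std::string, std::size_t>::iterator it = index_.find(key);
        if (it == index_.end()) {
            index_.insert(std::make_pair(key, entries_.size()));
            entries_.push_back(std::make_pair(key, delta));
        } else {
            entries_[it->second].second += delta;
        }
    }

    // Entries in the order in which their keys were first seen.
    const std::vector<std::pair<std::string, int> >& entries() const {
        return entries_;
    }

private:
    std::vector<std::pair<std::string, int> > entries_;    // values, insertion order
    std::unordered_map<std::string, std::size_t> index_;   // key -> index into entries_
};
```

Lookups go through the hash table and printing walks the vector, so neither operation has to pay for the other. Erasing a key is not covered here; it would require shifting or patching the stored indices.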

What you want (without resorting to Boost) is what I call an "ordered hash", which is essentially a mashup of a hash and a linked list with string or integer keys (or both at the same time). An ordered hash maintains the order of the elements during iteration with the absolute performance of a hash.
I've been putting together a relatively new C++ snippet library that fills in what I view as holes in the C++ language for C++ library developers. Go here:
https://github.com/cubiclesoft/cross-platform-cpp
Grab:
templates/detachable_ordered_hash.cpp
templates/detachable_ordered_hash.h
templates/detachable_ordered_hash_util.h
If user-controlled data will be placed into the hash, you might also want:
security/security_csprng.cpp
security/security_csprng.h
Invoke it:
#include "templates/detachable_ordered_hash.h"
...
// The 47 is the nearest prime to a power of two
// that is close to your data size.
//
// If your brain hurts, just use the lookup table
// in 'detachable_ordered_hash.cpp'.
//
// If you don't care about some minimal memory thrashing,
// just use a value of 3. It'll auto-resize itself.
int y;
CubicleSoft::OrderedHash<int> TempHash(47);
// If you need a secure hash (many hashes are vulnerable
// to DoS attacks), pass in two randomly selected 64-bit
// integer keys. Construct with CSPRNG.
// CubicleSoft::OrderedHash<int> TempHash(47, Key1, Key2);
CubicleSoft::OrderedHashNode<int> *Node;
...
// Push() for string keys takes a pointer to the string,
// its length, and the value to store. The new node is
// pushed onto the end of the linked list and wherever it
// goes in the hash.
y = 80;
TempHash.Push("key1", 5, y++);
TempHash.Push("key22", 6, y++);
TempHash.Push("key3", 5, y++);
// Adding an integer key into the same hash just for kicks.
TempHash.Push(12345, y++);
...
// Finding a node and modifying its value.
Node = TempHash.Find("key1", 5);
Node->Value = y++;
...
Node = TempHash.FirstList();
while (Node != NULL)
{
if (Node->GetStrKey()) printf("%s => %d\n", Node->GetStrKey(), Node->Value);
else printf("%d => %d\n", (int)Node->GetIntKey(), Node->Value);
Node = Node->NextList();
}
I ran into this SO thread during my research phase to see if anything like OrderedHash already existed without requiring me to drop in a massive library. I was disappointed. So I wrote my own. And now I've shared it.

Here is a solution that requires only the standard library, without using Boost's multi_index:
You could use a std::map<std::string,int> and a std::vector<data>: the map stores the index of each item's location in the vector, and the vector stores the data in insertion order. Access to the data has O(log n) complexity, displaying the data in insertion order has O(n) complexity, and insertion has O(log n) complexity.
For example:
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>
struct data {
    int value;
    std::string s;
};
// Maps each string to the index of its data in VectorData.
typedef std::map<std::string, int> MapIndex;
// Stores the data in insertion order.
typedef std::vector<data> VectorData;
void display_data_according_insertion_order(const VectorData& vectorData) {
    for (VectorData::const_iterator it = vectorData.begin(); it != vectorData.end(); ++it) {
        std::cout << it->value << it->s << std::endl;
    }
}
// Returns the index of s in the vector, or -1 if the key does not exist in the map.
int lookup_string(const std::string& s, const MapIndex& mapIndex) {
    MapIndex::const_iterator pt = mapIndex.find(s);
    if (pt != mapIndex.end()) return pt->second;
    return -1;
}
// Returns 1 on success, or 0 if the string is already present (the map stores unique keys).
int insert_value(const data& d, MapIndex& mapIndex, VectorData& vectorData) {
    if (mapIndex.find(d.s) == mapIndex.end()) {
        // The data is appended at the back, so its index is the size of the vector before insertion.
        mapIndex.insert(std::make_pair(d.s, static_cast<int>(vectorData.size())));
        vectorData.push_back(d);
        return 1;
    }
    return 0;
}

You cannot do that with a map, but you could use two separate structures - the map and the vector and keep them synchronized - that is when you delete from the map, find and delete the element from the vector. Or you could create a map<string, pair<int,int>> - and in your pair store the size() of the map upon insertion to record position, along with the value of the int, and then when you print, use the position member to sort.
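A sketch of the second idea, where each map value carries its insertion position next to the counter. The names PositionedMap, increment and inInsertionOrder are hypothetical, and the sketch assumes nothing is ever erased (otherwise m.size() could hand out a position twice).

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// value.first = insertion position, value.second = the counter
typedef std::map<std::string, std::pair<std::size_t, int> > PositionedMap;

// Increments the counter for `key`; a new key records the current size as its position.
void increment(PositionedMap& m, const std::string& key) {
    PositionedMap::iterator it = m.find(key);
    if (it == m.end()) {
        m.insert(std::make_pair(key, std::make_pair(m.size(), 1)));
    } else {
        ++it->second.second;
    }
}

// Returns (key, counter) pairs sorted by the recorded position.
std::vector<std::pair<std::string, int> > inInsertionOrder(const PositionedMap& m) {
    // (position, (key, counter)); std::pair compares by position first
    std::vector<std::pair<std::size_t, std::pair<std::string, int> > > tmp;
    for (PositionedMap::const_iterator it = m.begin(); it != m.end(); ++it) {
        tmp.push_back(std::make_pair(it->second.first,
                                     std::make_pair(it->first, it->second.second)));
    }
    std::sort(tmp.begin(), tmp.end());
    std::vector<std::pair<std::string, int> > out;
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        out.push_back(tmp[i].second);
    }
    return out;
}
```

The sort only happens when printing, so the hot path (lookup and increment) stays a plain map operation.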

One thing you need to consider is the small number of data elements you are using. It is possible that it will be faster to use just the vector. There is some overhead in the map that can cause it to be more expensive to do lookups in small data sets than the simpler vector. So, if you know that you will always be using around the same number of elements, do some benchmarking and see if the performance of the map and vector is what you really think it is. You may find the lookup in a vector with only 50 elements is near the same as the map.
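For reference, the plain-vector variant could look like the sketch below, which is what you would put on one side of such a benchmark. SmallTable and counterFor are made-up names, and I am not claiming any particular timing result; measure with your own data.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, int> > SmallTable;

// Linear search for `key`; appends the key with a zero counter when it is missing.
// The returned reference is invalidated by the next call that appends a new key.
int& counterFor(SmallTable& table, const std::string& key) {
    for (SmallTable::iterator it = table.begin(); it != table.end(); ++it) {
        if (it->first == key) {
            return it->second;
        }
    }
    table.push_back(std::make_pair(key, 0));
    return table.back().second;
}
```

Usage is just ++counterFor(table, word); and iterating the vector afterwards gives the pairs in first-insertion order for free.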

Another way to implement this is with a map instead of a vector. I will show you this approach and discuss the differences:
Just create a class that has two maps behind the scenes.
#include <map>
#include <string>
using namespace std;
class SpecialMap {
// usual stuff...
private:
int counter_;
map<int, string> insertion_order_;
map<string, int> data_;
};
You can then expose an iterator that walks data_ in the proper order. The way you do that is to iterate through insertion_order_ and, for each key you get from that iteration, do a lookup in data_.
You can use the more efficient hash_map for data_, since you don't care about directly iterating through data_.
To do inserts, you can have a method like this:
void SpecialMap::Insert(const string& key, int value) {
// This may be an over simplification... You ought to check
// if you are overwriting a value in data_ so that you can update
// insertion_order_ accordingly
insertion_order_[counter_++] = key;
data_[key] = value;
}
There are a lot of ways you can make the design better and worry about performance, but this is a good skeleton to get you started on implementing this functionality on your own. You can make it templated, and you might actually store pairs as values in data_ so that you can easily reference the entry in insertion_order_. But I leave these design issues as an exercise :-).
Update: I suppose I should say something about the efficiency of using a map vs. a vector for insertion_order_:
Lookups directly into data_ are O(1) in both cases.
Inserts are O(1) in the vector approach and O(log n) in the map approach.
Deletes are O(n) in the vector approach, because you have to scan for the item to remove. With the map approach they are O(log n).
If you are not going to use deletes much, you should probably use the vector approach. The map approach would be better if you were supporting a different ordering (like priority) instead of insertion order.

This is somewhat related to Faisal's answer. You can just create a wrapper class around a map and a vector and easily keep them synchronized. Proper encapsulation will let you control the access method and hence which container to use: the vector or the map. This avoids using Boost or anything like that.

// Should be like this man!
// This maintains the complexity of insertion is O(logN) and deletion is also O(logN).
class SpecialMap {
private:
int counter_;
map<int, string> insertion_order_;
map<string, int> insertion_order_reverse_look_up; // <- for fast delete
map<string, Data> data_;
};
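Filled in, that layout might look like the sketch below. I use int values instead of Data to keep it short, rename the reverse map to position_, and the member functions Insert, Erase, Keys and At are my guess at a reasonable interface rather than anything from the answers above. map::at needs C++11.

```cpp
#include <map>
#include <string>
#include <vector>

class SpecialMap {
public:
    SpecialMap() : counter_(0) {}

    // Inserts or overwrites; an existing key keeps its original position.
    void Insert(const std::string& key, int value) {
        if (position_.find(key) == position_.end()) {
            insertion_order_[counter_] = key;
            position_[key] = counter_;
            ++counter_;
        }
        data_[key] = value;
    }

    // O(log n) erase: the reverse map tells us which slot to drop.
    bool Erase(const std::string& key) {
        std::map<std::string, int>::iterator pos = position_.find(key);
        if (pos == position_.end()) return false;
        insertion_order_.erase(pos->second);
        position_.erase(pos);
        data_.erase(key);
        return true;
    }

    // Keys in insertion order.
    std::vector<std::string> Keys() const {
        std::vector<std::string> keys;
        for (std::map<int, std::string>::const_iterator it = insertion_order_.begin();
             it != insertion_order_.end(); ++it) {
            keys.push_back(it->second);
        }
        return keys;
    }

    // Throws std::out_of_range if the key is missing.
    int At(const std::string& key) const { return data_.at(key); }

private:
    int counter_;
    std::map<int, std::string> insertion_order_;  // position -> key
    std::map<std::string, int> position_;         // key -> position, for fast delete
    std::map<std::string, int> data_;             // key -> value
};
```

The counter is never decremented, so positions stay unique even after erases; the gaps do no harm because insertion_order_ is only ever walked in key order.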

There is no need to use a separate std::vector or any other container for keeping track of the insertion order. You can do what you want as shown below.
If you want to keep the insertion order then you can use the following program(version 1):
Version 1: For counting unique strings using std::map<std::string,int> in insertion order
#include <iostream>
#include <map>
#include <sstream>
int findExactMatchIndex(const std::string &totalString, const std::string &toBeSearched)
{
std::istringstream ss(totalString);
std::string word;
std::size_t index = 0;
while(ss >> word)
{
if(word == toBeSearched)
{
return index;
}
++index;
}
return -1;//return -1 when the string to be searched is not inside the inputString
}
int main() {
std::string inputString = "this is a string containing my name again and again and again ", word;
//this map maps the std::string to their respective count
std::map<std::string, int> wordCount;
std::istringstream ss(inputString);
while(ss >> word)
{
//std::cout<<"word:"<<word<<std::endl;
wordCount[word]++;
}
std::cout<<"Total unique words are: "<<wordCount.size()<<std::endl;
std::size_t i = 0;
std::istringstream gothroughStream(inputString);
//just go through the inputString(stream) instead of map
while( gothroughStream >> word)
{
int index = findExactMatchIndex(inputString, word);
if(index != -1 && (index == i)){
std::cout << word <<"-" << wordCount.at(word)<<std::endl;
}
++i;
}
return 0;
}
The output of the above program is as follows:
Total unique words are: 9
this-1
is-1
a-1
string-1
containing-1
my-1
name-1
again-3
and-2
Note that the program above splits on whitespace only, so a comma or any other punctuation stays attached to its word. For example, given the string "this is, my name is", the token "is," has a count of 1 and the token "is" has a count of 1. That is, "is," and "is" are different. This is because the computer doesn't know our definition of a word.
Note
The above program is a modification of my answer to How do i make the char in an array output in order in this nested for loop? which is given as version 2 below:
Version 2: For counting unique characters using std::map<char, int> in insertion order
#include <iostream>
#include <map>
int main() {
std::string inputString;
std::cout<<"Enter a string: ";
std::getline(std::cin,inputString);
//this map maps the char to their respective count
std::map<char, int> charCount;
for(char &c: inputString)
{
charCount[c]++;
}
std::size_t i = 0;
//just go through the inputString instead of map
for(char &c: inputString)
{
std::size_t index = inputString.find(c);
if(index != inputString.npos && (index == i)){
std::cout << c <<"-" << charCount.at(c)<<std::endl;
}
++i;
}
return 0;
}
In both cases/versions there is no need to use a separate std::vector or any other container to keep track of the insertion order.

Use boost::multi_index with map and list indices.

A map of pair (str,int) and static int that increments on insert calls indexes pairs of data. Put in a struct that can return the static int val with an index () member perhaps?

Check for common members in vector c++

What is the best way to verify if there are common members within multiple vectors?
The vectors aren't necessarily of equal size and they may contain custom data (such as structures containing two integers that represent a 2D coordinate).
For example:
vec1 = {(1,2); (3,1); (2,2)};
vec2 = {(3,4); (1,2)};
How to verify that both vectors have a common member?
Note that I am trying to avoid inefficient methods such as going through all elements and checking for equal data.

For non-trivial data sets, the most efficient method is probably to sort both vectors and then use the std::set_intersection function defined in <algorithm>, as follows:
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>
using namespace std;
typedef vector<pair<int, int>> tPointVector;
int main() {
    tPointVector vec1 {{1,2}, {3,1}, {2,2}};
    tPointVector vec2 {{3,4}, {1,2}};
    // set_intersection requires both ranges to be sorted
    std::sort(begin(vec1), end(vec1));
    std::sort(begin(vec2), end(vec2));
    tPointVector vec3; // receives the common elements
    vec3.reserve(std::min(vec1.size(), vec2.size()));
    set_intersection(begin(vec1), end(vec1), begin(vec2), end(vec2), back_inserter(vec3));
    return 0;
}
You may get better performance with a nonstandard algorithm if you do not need to know which elements are common, but only the number of common elements, because then you can avoid having to create new copies of the common elements.
In any case, it seems to me that starting by sorting both containers will give you the best performance for data sets with more than a few dozen elements.
Here's an attempt at writing an algorithm that just gives you the count of matching elements (untested):
auto it1 = begin(vec1);
auto it2 = begin(vec2);
const auto end1 = end(vec1);
const auto end2 = end(vec2);
sort(it1, end1);
sort(it2, end2);
size_t numCommonElements = 0;
while (it1 != end1 && it2 != end2) {
bool oneIsSmaller = *it1 < *it2;
if (oneIsSmaller) {
it1 = lower_bound(it1, end1, *it2);
} else {
bool twoIsSmaller = *it2 < *it1;
if (twoIsSmaller) {
it2 = lower_bound(it2, end2, *it1);
} else {
// none of the elements is smaller than the other
// so it's a match
++it1;
++it2;
++numCommonElements;
}
}
}
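If the only question is whether there is any common member at all, the same sorted walk can stop at the first match. A sketch, with haveCommonElement and PointVector as illustrative names; it takes its arguments by value so the callers' vectors keep their order.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, int> > PointVector;

// Returns true as soon as one element occurs in both vectors.
// Works on copies, so the callers' vectors are left untouched.
bool haveCommonElement(PointVector a, PointVector b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    PointVector::const_iterator i = a.begin();
    PointVector::const_iterator j = b.begin();
    while (i != a.end() && j != b.end()) {
        if (*i < *j) {
            ++i;
        } else if (*j < *i) {
            ++j;
        } else {
            return true;  // neither is smaller, so the two elements are equal
        }
    }
    return false;
}
```

This uses plain increments instead of lower_bound; which of the two is faster depends on how unevenly the two vectors are sized.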

Note that I am trying to avoid inefficient methods such as going through all elements and checking for equal data.
You need to go through all elements at least once; I assume you mean that you don't want to check every combination. Indeed, you don't want to do this:
for every element in vec1, go through the entire vec2 to check whether the element is there. This won't be efficient if your vectors have a large number of elements.
If you prefer a linear-time solution and you don't mind using extra memory, here is what you can do:
You need a hash function to insert elements into an unordered_map or unordered_set.
See https://stackoverflow.com/a/13486174/2502814
// common elements of two vectors via unordered_set
#include <functional> // std::hash
#include <iostream> // std::cout
#include <unordered_set> // std::unordered_set
#include <utility> // std::pair
#include <vector> // std::vector
using namespace std;
namespace std {
template <>
struct hash<pair<int, int>>
{
typedef pair<int, int> argument_type;
typedef std::size_t result_type;
result_type operator()(const pair<int, int> & t) const
{
std::hash<int> int_hash;
return int_hash(t.first + 6495227 * t.second);
}
};
}
int main () {
vector<pair<int, int>> vec1 {{1,2}, {3,1}, {2,2}};
vector<pair<int, int>> vec2 {{3,4}, {1,2}};
// Copy all elements from vec2 into an unordered_set
unordered_set<pair<int, int>> in_vec2;
in_vec2.insert(vec2.begin(),vec2.end());
// Traverse vec1 and check if elements are here
for (auto& e : vec1)
{
if(in_vec2.find(e) != in_vec2.end()) // Searching in an unordered_set is faster than going through all elements of vec2 when vec2 is big.
{
//Here are the elements in common:
cout << "{" << e.first << "," << e.second << "} is in common!" << endl;
}
}
return 0;
}
Output : {1,2} is in common!
You can either do that, or copy all elements of vec1 into an unordered_set, and then traverse vec2.
Depending on the sizes of vec1 and vec2, one solution might be faster than the other.
Keep in mind that picking the smaller vector to insert in the unordered_set also means you will use less extra memory.
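A sketch of that size-based choice. Instead of specializing std::hash for std::pair<int, int>, it passes a custom hasher as a template argument; PointHash, Point and commonElements are names invented here, and the hash mixing formula is just one common choice, not the one from the linked answer.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

typedef std::pair<int, int> Point;

// Hypothetical hasher, passed explicitly to the unordered_set.
struct PointHash {
    std::size_t operator()(const Point& p) const {
        std::size_t h1 = std::hash<int>()(p.first);
        std::size_t h2 = std::hash<int>()(p.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// Returns the elements present in both vectors, each reported once.
std::vector<Point> commonElements(const std::vector<Point>& a,
                                  const std::vector<Point>& b) {
    // Hash the smaller vector to keep the extra memory low.
    const std::vector<Point>& smaller = a.size() <= b.size() ? a : b;
    const std::vector<Point>& larger = a.size() <= b.size() ? b : a;
    std::unordered_set<Point, PointHash> seen(smaller.begin(), smaller.end());
    std::vector<Point> common;
    for (std::vector<Point>::const_iterator it = larger.begin(); it != larger.end(); ++it) {
        // erase returns the number of removed elements, so a match is consumed once
        if (seen.erase(*it) > 0) {
            common.push_back(*it);
        }
    }
    return common;
}
```

The result is ordered as the matches appear in the larger vector.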

I believe you can use a 2D tree to search in two dimensions. An optimal algorithm for the problem you specified would fall under the class of geometric algorithms. Maybe this link is of use to you: http://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/geosearch.pdf .

std::sort that also keeps track of number of unique entries at each level

Say I have a std::vector. Say the vectors contain numbers. Let's take this std::vector
1,3,5,4,3,4,5,1,6,3
std::sort<std::less<int>> will sort this into
1,1,3,3,3,4,4,5,5,6,
How would I amend sort so that, while it is sorting, it also computes how many numbers there are at each level? So, in addition to sorting, it would also compile the following dictionary [level is also int]
std::map<level, int>
<1, 2>
<2, 3>
<3, 2>
<4, 2>
<5, 1>
<6, 1>
so there are 2 1's, 3 3's, 2 4's, and so on.
The reason I [think] I need this is because I don't want to sort the vector, THEN once again, compute the number of duplicates at each level. It seems faster to do it both in one pass?
Thank you all! bjskishore123 is the closest thing to what I was asking, but all the responses educated me. Thanks again.

As stated by @bjskishore123, you can use a map to guarantee the correct order of your set. As a bonus, you will have a structure optimized for searching (the map, of course).
Inserting into or searching a map takes O(log(n)) time, while traversing the vector is O(n). So the algorithm is O(n*log(n)), which is the same complexity as any sort algorithm that needs to compare elements: merge sort or quick sort, for example.
Here is a sample code for you:
int tmp[] = {5,5,5,5,5,5,2,2,2,2,7,7,7,7,1,1,1,1,6,6,6,2,2,2,8,8,8,5,5};
std::vector<int> values(tmp, tmp + sizeof(tmp) / sizeof(tmp[0]));
std::map<int, int> map_values;
for_each(values.begin(), values.end(), [&](int value)
{
map_values[value]++;
});
for(std::map<int, int>::iterator it = map_values.begin(); it != map_values.end(); it++)
{
std::cout << it->first << ": " << it->second << "times" << std::endl;
}
Output:
1: 4times
2: 7times
5: 8times
6: 3times
7: 4times
8: 3times

I don't think you can do this in one pass. Let's say you provide your own custom comparator for sorting which somehow tries to count the duplicates.
However the only thing you can capture in the sorter is the value(maybe reference but doesn't matter) of the current two elements being compared. You have no other information because std::sort doesn't pass any thing else to the sorter.
Now the way std::sort works it will keep swapping elements until they reach the proper location in the sorted vector. That means a single member can be sent to the sorter multiple times making it impossible to count exactly. You can count how many times a certain element and all others value equal to it have been moved but not exactly how many of them are in there.

Instead of using a vector, store the numbers one by one in a std::multiset container. It keeps its elements in sorted order internally.
While storing each number, use a map to keep track of the number of occurrences of each number.
map<int, int> m;
Each time a number is added, do
m[num]++;
So there is no need for another pass to calculate the number of occurrences, although you need to iterate over the map to get each occurrence count.
=============================================================================
The following is an alternate solution, which is not recommended. I am giving it because you asked for a way that uses std::sort.
The code below makes use of the comparison function to count the occurrences.
#include <iostream>
#include <map>
#include <vector>
#include <algorithm>
using namespace std;
struct Elem
{
int index;
int num;
};
std::map<int, int> countMap; //Count map
std::map<int, bool> visitedMap;
bool compare(Elem a, Elem b)
{
if(visitedMap[a.index] == false)
{
visitedMap[a.index] = true;
countMap[a.num]++;
}
if(visitedMap[b.index] == false)
{
visitedMap[b.index] = true;
countMap[b.num]++;
}
return a.num < b.num;
}
int main()
{
vector<Elem> v;
Elem e[5] = {{0, 10}, {1, 20}, {2, 30}, {3, 10}, {4, 20} };
for(size_t i = 0; i < 5; i++)
v.push_back(e[i]);
std::sort(v.begin(), v.end(), compare);
for(map<int, int>::iterator it = countMap.begin(); it != countMap.end(); it++)
cout<<"Element : "<<it->first<<" occurred "<<it->second<<" times"<<endl;
}
Output:
Element : 10 occurred 2 times
Element : 20 occurred 2 times
Element : 30 occurred 1 times

If you have lots of duplicates, the fastest way to accomplish this task is probably to first count duplicates using a hash map, which is O(n), and then to sort the map, which is O(m log m) where m is the number of unique values.
Something like this (in c++11):
#include <algorithm>
#include <unordered_map>
#include <utility>
#include <vector>
std::vector<std::pair<int, int>> uniqsort(const std::vector<int>& v) {
std::unordered_map<int, int> count;
for (auto& val : v) ++count[val];
std::vector<std::pair<int, int>> result(count.begin(), count.end());
std::sort(result.begin(), result.end());
return result;
}
There are lots of variations on the theme, depending on what you need, precisely. For example, perhaps you don't even need the result to be sorted; maybe it's enough to just have the count map. Or maybe you would prefer the result to be a sorted map from int to int, in which case you could just build a regular std::map, instead. (That would be O(n log m).) Or maybe you know something about the values which make them faster to sort (like the fact that they are small integers in a known range.) And so on.
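For the last variation (small non-negative integers in a known range), a counting sort produces the sorted vector and the per-value counts in the same pass. This is a sketch under that assumption: the function name countingSort and the maxValue parameter are mine, and the caller must guarantee that every value lies in [0, maxValue].

```cpp
#include <cstddef>
#include <vector>

// Sorts `v` in place and returns counts, where counts[x] is the number of
// occurrences of x. Assumes every value lies in [0, maxValue].
std::vector<std::size_t> countingSort(std::vector<int>& v, int maxValue) {
    std::vector<std::size_t> counts(static_cast<std::size_t>(maxValue) + 1, 0);
    for (std::size_t i = 0; i < v.size(); ++i) {
        ++counts[static_cast<std::size_t>(v[i])];
    }
    // Rewrite the vector from the counts, smallest value first.
    std::size_t out = 0;
    for (int value = 0; value <= maxValue; ++value) {
        for (std::size_t k = 0; k < counts[static_cast<std::size_t>(value)]; ++k) {
            v[out++] = value;
        }
    }
    return counts;
}
```

This runs in O(n + maxValue) time and uses O(maxValue) extra memory, so it only pays off when the range is small compared to the amount of data.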
