Erase list element from unordered_set - c++

I have a list<pair<int , double>> lSeedList and an unordered_set<int> sToDelete. I want to remove the pairs in the list that have their first member equal to an int in sToDelete. Currently I am using the following code :
void updateSL(list<pair<int, double> >& lSeedList, const unordered_set<int>& sAddedFacets)
{
list<pair<int, double> >::iterator it = lSeedList.begin();
while(it != lSeedList.end())
{
if(sAddedFacets.count(it->first) != 0)
it = lSeedList.erase(it);
else
++it;
}
}
Is there a way to speed up this code? Is it possible to parallelize it efficiently with OpenMP (splitting the list across threads and then merging the pieces with splice)?
I am using Visual Studio 2010 under Windows 7. The size of lSeedList is ~1 million at the start and the size of sToDelete is ~10000. The int in the pair acts like a unique ID.

It is better to use either the standard algorithm std::remove_if
For example
lSeedList.erase( std::remove_if( lSeedList.begin(), lSeedList.end(),
[&]( const std::pair<int, double> &p )
{
return sAddedFacets.count( p.first );
} ),
lSeedList.end() );
Or the member function remove_if of class std::list
For example
lSeedList.remove_if( [&]( const std::pair<int, double> &p )
{
return sAddedFacets.count( p.first );
} );
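As a self-contained sketch of the member-function variant: the helper name removeSeeds is hypothetical and not part of the question's code. For a std::list the member remove_if unlinks the matching nodes in a single pass, whereas std::remove_if shifts the surviving elements forward by assignment, so the member version is usually the more natural fit here. Whether it is measurably faster for ~1 million elements would need profiling.

```cpp
#include <list>
#include <unordered_set>
#include <utility>

// Hypothetical helper: drops every pair whose ID (first member) is in toDelete.
void removeSeeds(std::list<std::pair<int, double>>& seeds,
                 const std::unordered_set<int>& toDelete)
{
    seeds.remove_if([&toDelete](const std::pair<int, double>& p) {
        // count() is 0 or 1 for an unordered_set; compare explicitly for clarity.
        return toDelete.count(p.first) != 0;
    });
}
```

Calling removeSeeds(lSeedList, sAddedFacets) would replace the hand-written loop from the question.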

Related

How can I erase duplicated elements from a multimap<int, std::pair<int, bool>>?

I have a multimap with duplicates. When I have finished collecting the elements, I would like to erase the duplicates.
Here is the container:
std::multimap<int, std::pair<int, bool>> container;
The following code is inside a loop (it is a simpler version of the original):
container.emplace(LeafId, std::make_pair(NodeId, isElectronic));
Is this a good solution?
std::pair<int, std::pair<int, bool>> lastValue {-1 , {-1, -1}};
for (auto it = container.cbegin(); it != container.cend();)
{
if (it->first == lastValue.first && it->second == lastValue.second)
{
it = container.erase(it);
} else
{
lastValue = *it;
++it;
}
}
Unless you keep the inner pairs sorted within each key, no, it is not a good solution, because it only catches duplicates that happen to be adjacent. If you can change the data type, you can use:
std::map<int, std::vector<std::pair<int, bool>>>
instead of std::multimap, and then for each key sort the vector and remove duplicates using standard algorithms, as described in "Removing duplicates in a vector of strings".
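A minimal sketch of that sort-and-unique step, assuming the container has already been changed to the map-of-vectors type suggested above. The names Grouped and dedupe are hypothetical.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

typedef std::map<int, std::vector<std::pair<int, bool>>> Grouped;

// Hypothetical helper: sorts each key's vector, then erases adjacent duplicates.
void dedupe(Grouped& container)
{
    for (auto& entry : container) {
        auto& values = entry.second;
        std::sort(values.begin(), values.end());
        // std::unique only removes *adjacent* equal elements, hence the sort first.
        values.erase(std::unique(values.begin(), values.end()), values.end());
    }
}
```

Note that this reorders the pairs under each key, which the multimap version would not do.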
If you cannot, I suggest using an additional std::set (or std::unordered_set, which would need a custom hash for std::pair):
std::set<std::pair<int, bool>> tset;
int lastValue = 0;
for (auto it = container.cbegin(); it != container.cend();)
{
if( it->first != lastValue ) {
tset.clear();
lastValue = it->first;
}
if( !tset.insert( it->second ).second )
it = container.erase( it );
else
++it;
}

In a map how to collect keys of the same value into a vector?

I want to collect keys of the same value in a map. What is the easiest way to do it using vector? That means all the keys having the same value can be collected in a vector.
You will have to do a linear search over the whole container, which is O(N).
std::vector<Key> keys;
std::for_each(map.begin(), map.end(),
[&](std::map<Key,Value>::value_type const & x) {
if (x.second == value)
keys.push_back(x.first);
});
If you want to extract all keys for which the value is not unique, the complexity of the code is higher, and you will need additional data, but you could do something like this:
std::map<Value, std::pair<Key, bool>> tracker;
// Maps a 'Value' to the first 'Key' that had it, and a 'bool'
// identifying if it has already been inserted into the vector.
std::vector<Key> keys;
std::for_each(m.begin(), m.end(),
[&](std::map<Key, Value>::value_type const& x) {
auto r = tracker.insert(std::make_pair(x.second,
std::make_pair(x.first, false)));
if (!r.second) {
// Not the first time we saw this value
if (!r.first->second.second) {
// First key not already inserted, insert now and update flag
keys.push_back(r.first->second.first);
r.first->second.second = true;
}
keys.push_back(x.first);
}
});
Although in real code I would avoid using std::pair and would create a named type that makes the code simpler to read. In the code above it is not obvious what all those first and second mean…
A different alternative, probably more efficient (measure and profile) would be to use transform to create a vector where the elements are swapped and then iterate over that vector extracting the values of interest.
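One way the transform-based alternative could look, as a sketch only: the helper name keysWithSharedValues is hypothetical, and the concrete types int and std::string stand in for Key and Value. It copies the map into a vector of swapped (value, key) pairs, sorts it so equal values become adjacent, and then collects the keys of every run longer than one.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the keys whose value occurs more than once.
std::vector<int> keysWithSharedValues(const std::map<int, std::string>& m)
{
    // Build (value, key) pairs so that sorting groups equal values together.
    std::vector<std::pair<std::string, int>> swapped(m.size());
    std::transform(m.begin(), m.end(), swapped.begin(),
                   [](const std::pair<const int, std::string>& x) {
                       return std::make_pair(x.second, x.first);
                   });
    std::sort(swapped.begin(), swapped.end());

    std::vector<int> keys;
    for (std::size_t i = 0; i < swapped.size(); ) {
        // Find the end of the run of equal values starting at i.
        std::size_t j = i + 1;
        while (j < swapped.size() && swapped[j].first == swapped[i].first) ++j;
        if (j - i > 1)
            for (std::size_t k = i; k < j; ++k) keys.push_back(swapped[k].second);
        i = j;
    }
    return keys;
}
```

The keys come out grouped by value rather than in the map's key order.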
You can do it the following way
#include <iostream>
#include <vector>
#include <string>
#include <map>
int main()
{
std::map<int, std::string> m
{
{ 1, "Monday" }, { 2, "Tuesday" }, { 9, "Monday" }
};
std::vector<int> v;
size_t n = 0;
std::string s( "Monday" );
for ( const auto &p : m )
{
if ( p.second == s ) ++n;
}
v.reserve( n );
for ( const auto &p : m )
{
if ( p.second == s ) v.push_back( p.first );
}
for ( const auto &x : v ) std::cout << x << ' ';
std::cout << std::endl;
return 0;
}
The output is
1 9
You can replace the range-based for statements with the std::count_if and std::for_each algorithms along with lambda expressions. But in my opinion, for this simple task it is better to use the range-based for statements.
Where you are creating the map, consider creating an unordered_multimap with the key and value of the original map swapped.
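A sketch of how such a swapped index could be queried, assuming int keys and std::string values as in the weekday example. The names ReverseIndex and keysFor are hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// The original map's value becomes the key of the index, and vice versa.
typedef std::unordered_multimap<std::string, int> ReverseIndex;

// Hypothetical helper: collects every original key stored under the given value.
std::vector<int> keysFor(const ReverseIndex& index, const std::string& value)
{
    std::vector<int> keys;
    auto range = index.equal_range(value); // all entries with this value
    for (auto it = range.first; it != range.second; ++it)
        keys.push_back(it->second);
    return keys;
}
```

The lookup is then average constant time plus the number of matches, but the order of the returned keys is unspecified.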

Implementation of lower_bound on vector pairs

I know we need to supply some compare function in order to achieve this,
but I am not able to write one for this case.
For example:
Elements of vector={(2,4),(4,2),(5,1),(5,3)}
to find=5
lower_bound() should return 2
code->
#define pp pair<int,int>
bool cmp(const pp &l,const pp &r) {
return l.first < r.first;
}
int main() {
vector<pp> v;
sort(v.begin(), v.end(), cmp);
int id=(int)(lower_bound(v.begin(), v.end(), ??) - v.begin());
}
Pairs (just like tuples) compare lexicographically anyway. You don't need to define any special comparators for this.
And since you're using lower_bound, you'll be searching for the first element that does not compare less than the value you're looking for, so you should use the minimum value as the second pair element. To sum up, it can all be done in "two" lines of code:
sort(v.begin(),v.end());
auto id = distance(v.begin(), lower_bound(v.begin(),v.end(),
make_pair(5, numeric_limits<int>::min())) );
Some Notes :
Use std::distance to calculate the number of elements between two iterators
The return type of std::distance is the iterator's difference_type, which is a signed type (std::ptrdiff_t for std::vector iterators). Convert it explicitly if you want to keep your indexes unsigned.
Since you don't care about the second value of pp, just construct a temporary pp object with any value as the second element.
int id = std::lower_bound(v.begin(), v.end(), pp(5, 0), cmp) - v.begin();
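Another option, sketched here under the assumption that the vector is sorted by its first member, is to search for the bare int and give lower_bound a comparator that takes an element on the left and the key on the right. That avoids inventing a dummy second value. The helper name lowerBoundByFirst is hypothetical.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair<int, int> pp;

// Hypothetical helper: index of the first pair whose .first is >= key.
// Assumes v is already sorted by .first.
int lowerBoundByFirst(const std::vector<pp>& v, int key)
{
    auto it = std::lower_bound(v.begin(), v.end(), key,
                               // lower_bound calls comp(element, value)
                               [](const pp& element, int k) {
                                   return element.first < k;
                               });
    return static_cast<int>(it - v.begin());
}
```

If no element has a first member of at least key, the result equals v.size().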
I think you should compare the pairs as per definition of lower_bound
So,
typedef pair<int,int> pp;
//...
int id=(int)(lower_bound(v.begin(),v.end(),
pp(5,std::numeric_limits<int>::min()), //Value to compare
[](const pp& lhs, const pp& rhs) // Lambda
{
return lhs < rhs ; // first argument < second
}
) - v.begin()
);
You can use lower_bound on a vector of pairs with a custom comparator.
In that case you need to pass four arguments, like this:
it1 = iterator position from where to search
it2 = iterator position till where to search
lower_bound (it1 ,it2 , finding_element, your_comparator )
auto myComp = [](const pair<int,string>& e1, const pair<int,string>& e2) {
if(e1.second!=e2.second)
return e1.second<e2.second;
else
return e1.first<e2.first;
};
void Show_sample_code()
{
vector<pair<int,string>> data={{1, "sahil"}, {2, "amin"}};
sort(data.begin(), data.end(), myComp);
pair<int, string> p={1,"sahil"};
auto it=lower_bound( data.begin(), data.end(), p, myComp ) ;
if(it!=data.end())
cout<<"found at index="<<distance(data.begin(), it)<<endl;
else
cout<<"notfound"<<endl;
return;
}

How can I get the top n keys of std::map based on their values?

How can I get the top n keys of std::map based on their values?
Is there a way to get a list of, say, the top 10 keys with the largest values?
Suppose we have a map similar to this :
mymap["key1"]= 10;
mymap["key2"]= 3;
mymap["key3"]= 230;
mymap["key4"]= 15;
mymap["key5"]= 1;
mymap["key6"]= 66;
mymap["key7"]= 10;
I only want a list of the top 10 keys, that is, the keys with the largest values.
For example, the top 4 for our mymap is
key3
key6
key4
key1
key10
Note 1:
The values are not unique; they are the number of occurrences of each key, and I want a list of the most frequent keys.
Note 2:
If a map is not a good candidate and you want to suggest something else, please stay within C++11; I can't use Boost at the moment.
Note 3:
If I use std::unordered_multimap<int,wstring>, do I have any other choices?
A map is ordered by its keys, not its values, and cannot be reordered. You therefore have to iterate over the map and maintain a list of the top ten entries encountered or, as Potatoswatter commented, use partial_sort_copy() to extract the top N entries for you:
std::vector<std::pair<std::string, int>> top_four(4);
std::partial_sort_copy(mymap.begin(),
mymap.end(),
top_four.begin(),
top_four.end(),
[](std::pair<const std::string, int> const& l,
std::pair<const std::string, int> const& r)
{
return l.second > r.second;
});
Choosing a different type of container may be more appropriate, boost::multi_index would be worth investigating, which:
... enables the construction of containers maintaining one or more indices with different sorting and access semantics.
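The other option mentioned in this answer, iterating over the map while maintaining a running list of the best entries, could be sketched with a bounded min-heap. The helper name topKeys is hypothetical, and this is just one possible way to keep the running list, using only the C++11 standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns up to n keys with the largest values,
// highest value first. Ties on value are broken by key, because the
// (value, key) pairs compare lexicographically.
std::vector<std::string> topKeys(const std::map<std::string, int>& m, std::size_t n)
{
    typedef std::pair<int, std::string> Entry; // (value, key)
    // Min-heap: the smallest of the current candidates sits on top.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& kv : m) {
        heap.push(Entry(kv.second, kv.first));
        if (heap.size() > n) heap.pop(); // discard the current minimum
    }
    std::vector<std::string> keys;
    while (!heap.empty()) {
        keys.push_back(heap.top().second);
        heap.pop();
    }
    std::reverse(keys.begin(), keys.end()); // largest value first
    return keys;
}
```

The heap never holds more than n + 1 entries, so the extra memory stays proportional to n rather than to the size of the map.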
#include <iostream>
#include <map>
#include <vector>
#include <algorithm>
#include <string>
#include <cstdlib> // rand
using namespace std;
int main(int argc, const char * argv[])
{
map<string, int> entries;
// insert some random entries
for(int i = 0; i < 100; ++i)
{
string name(5, 'A' + (char)(rand() % (int)('Z' - 'A') ));
int number = rand() % 100;
entries.insert(pair<string, int>(name, number));
}
// create container for top 10
vector<pair<string, int>> sorted(10);
// sort and copy with reversed compare function using second value of std::pair
partial_sort_copy(entries.begin(), entries.end(),
sorted.begin(), sorted.end(),
[](const pair<string, int> &a, const pair<string, int> &b)
{
return a.second > b.second; // strict ordering; !(a < b) is not a valid comparator
});
cout << endl << "all elements" << endl;
for(pair<string, int> p : entries)
{
cout << p.first << " " << p.second << endl;
}
cout << endl << "top 10" << endl;
for(pair<string, int> p : sorted)
{
cout << p.first << " " << p.second << endl;
}
return 0;
}
Not only does std::map not sort by mapped-to value (such values need not have any defined sorting order), it doesn't allow rearrangement of its elements, so doing ++ map[ "key1" ]; on a hypothetical structure mapping the values back to the keys would invalidate the backward mapping.
Your best bet is to put the key-value pairs into another structure, and sort that by value at the time you need the backward mapping. If you need the backward mapping at all times, you would have to remove, modify, and re-add each time the value is changed.
The most efficient way to sort the existing map into a new structure is std::partial_sort_copy, as (just now) illustrated by Al Bundy.
Since the mapped values are not indexed, you would have to read everything and select the 10 biggest values.
std::vector<mapped_type> v;
v.reserve(mymap.size());
for(const auto& Pair : mymap)
v.push_back( Pair.second );
std::sort(v.begin(), v.end(), std::greater<mapped_type>());
for(std::size_t i = 0, n = std::min<std::size_t>(10,v.size()); i < n; ++i)
std::cout << v[i] << ' ';
Another way is to use two maps or a bimap, so that the mapped values are ordered as well.
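The algorithm you're looking for is nth_element, which partially sorts a range so that the nth element is where it would be in a fully sorted range, with everything that orders before it placed in front of it. For example, if you wanted the top three items under a descending predicate (the three items themselves are left in unspecified order), you'd write (in pseudo C++)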
nth_element(begin, begin + 3, end, predicate)
The problem is nth_element doesn't work with std::map. I would therefore suggest you change your data structure to a vector of pairs (and depending on the amount of data you're dealing with, you may find this to be a quicker data structure anyway). So, in the case of your example, I'd write it like this:
typedef vector<pair<string, int>> MyVector;
typedef MyVector::value_type ValueType;
MyVector v;
// You should use an initialization list here if your
// compiler supports it (mine doesn't...)
v.emplace_back(ValueType("key1", 10));
v.emplace_back(ValueType("key2", 3));
v.emplace_back(ValueType("key3", 230));
v.emplace_back(ValueType("key4", 15));
v.emplace_back(ValueType("key5", 1));
v.emplace_back(ValueType("key6", 66));
v.emplace_back(ValueType("key7", 10));
nth_element(v.begin(), v.begin() + 3, v.end(),
[](ValueType const& x, ValueType const& y) -> bool
{
// sort descending by value
return y.second < x.second;
});
// print out the top three elements
for (size_t i = 0; i < 3; ++i)
cout << v[i].first << ": " << v[i].second << endl;
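One caveat worth illustrating: nth_element only guarantees that the first three positions hold the three largest values, not that they are ordered among themselves. If the printed order matters, a sort of just that prefix can follow. A sketch, with the hypothetical helper name sortTopN:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> ValueType;

// Hypothetical helper: moves the n largest-valued pairs to the front of v
// and orders just those n by descending value.
void sortTopN(std::vector<ValueType>& v, std::size_t n)
{
    auto byValueDesc = [](const ValueType& x, const ValueType& y) {
        return y.second < x.second;
    };
    n = std::min(n, v.size()); // never step past the end
    std::nth_element(v.begin(), v.begin() + n, v.end(), byValueDesc);
    std::sort(v.begin(), v.begin() + n, byValueDesc);
}
```

std::partial_sort with the same predicate would achieve the same result in a single call; the two-step form is shown only because it builds on the nth_element call above.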
#include "stdafx.h"
#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <algorithm>
#include <cassert>
#include <iterator>
using namespace std;
class MyMap
{
public:
MyMap(){};
void addValue(string key, int value)
{
_map[key] = value;
_vec.push_back(make_pair(key, value));
sort(_vec.begin(), _vec.end(), Cmp());
}
vector<pair<string, int> > getTop(int n)
{
size_t len = min(static_cast<size_t>(n), _vec.size());
vector<Pair> res;
copy(_vec.begin(), _vec.begin() + len, back_inserter(res));
return res;
}
private:
typedef map<string, int> StrIntMap;
typedef vector<pair<string, int> > PairVector;
typedef pair<string, int> Pair;
StrIntMap _map;
PairVector _vec;
struct Cmp:
public binary_function<const Pair&, const Pair&, bool>
{
bool operator()(const Pair& left, const Pair& right)
{
return right.second < left.second;
}
};
};
int main()
{
MyMap mymap;
mymap.addValue("key1", 10);
mymap.addValue("key2", 3);
mymap.addValue("key3", 230);
mymap.addValue("key4", 15);
mymap.addValue("key6", 66);
mymap.addValue("key7", 10);
auto res = mymap.getTop(3);
for_each(res.begin(), res.end(), [](const pair<string, int> value)
{cout<<value.first<<" "<<value.second<<endl;});
}
The simplest solution would be to use std::transform to build
a second map:
typedef std::multimap<int, std::string> SortedByValue; // multimap, because the values are not unique
SortedByValue map2;
std::transform(
mymap.begin(), mymap.end(),
std::inserter( map2, map2.end() ),
[]( std::pair<std::string, int> const& original ) {
return std::pair<int, std::string>( original.second, original.first );
} );
Then pick off the last n elements of map2.
Alternatively (and probably more efficient), you could use an
std::vector<std::pair<int, std::string>> and sort it
afterwards:
std::vector<std::pair<int, std::string>> map2( mymap.size() );
std::transform(
mymap.begin(), mymap.end(),
map2.begin(),
[]( std::pair<std::string, int> const& original ) {
return std::pair<int, std::string>( original.second, original.first );
} );
std::sort( map2.begin(), map2.end() );
(Note that these solutions optimize for time, at the cost of
more memory.)

Accessing for_each iterator from lambda

Is it possible to access the std::for_each iterator, so I can erase the current element from an std::list using a lambda (as below)
typedef std::shared_ptr<IEvent> EventPtr;
std::list<EventPtr> EventQueue;
EventType evt;
...
std::for_each(
EventQueue.begin(), EventQueue.end(),
[&]( EventPtr pEvent )
{
if( pEvent->EventType() == evt.EventType() )
EventQueue.erase( ???Iterator??? );
}
);
I've read about using [](typename T::value_type x){ delete x; } here on SO, but VS2010 doesn't seem to like this statement (underlines T as error source).
You are using the wrong algorithm. Use remove_if:
EventQueue.remove_if([&](EventPtr const& pEvent)
{
return pEvent->EventType() == evt.EventType();
});
The STL algorithms do not give you access to the iterator being used for iteration. This is in most cases a good thing.
(In addition, consider whether you really want to use std::list; it's unlikely that it is the right container for your use case. Consider std::vector, with which you would use the erase/remove idiom to remove elements that satisfy a particular predicate.)
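For reference, the erase/remove idiom on a std::vector could look like the sketch below. The Event struct is a hypothetical stand-in for the question's IEvent interface, and removeEvents is a hypothetical helper name.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's IEvent interface.
struct Event {
    int type;
    explicit Event(int t) : type(t) {}
    int EventType() const { return type; }
};
typedef std::shared_ptr<Event> EventPtr;

// Removes every queued event whose type matches `type`.
void removeEvents(std::vector<EventPtr>& queue, int type)
{
    // remove_if moves the survivors to the front and returns the new logical
    // end; erase then drops the leftover tail in one call.
    queue.erase(std::remove_if(queue.begin(), queue.end(),
                               [type](const EventPtr& e) {
                                   return e->EventType() == type;
                               }),
                queue.end());
}
```

The relative order of the remaining events is preserved.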
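No, use a regular for loop instead. Note that the iterator is only incremented when nothing was erased, because erase already returns the iterator to the next element.
for( auto it = EventQueue.begin(); it != EventQueue.end(); )
{
auto pEvent = *it;
if( pEvent->EventType() == evt.EventType() )
it = EventQueue.erase( it );
else
++it;
}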
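Erasing is not the only case where you may need the iterator inside the lambda.
For contiguous storage (a plain array or a std::vector), you can take the address of the element with the & operator and use that pointer as an iterator, like this. Note that this does not work for node-based containers such as std::list, where neighbouring elements are not adjacent in memory: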
int main (int argc, char* argv []) {
size_t tmp [6] = {0, 1, 2, 3, 4, 5};
std::list<size_t> ls ((size_t*)tmp, (size_t*) &tmp [6]);
//printing next element
std::for_each ((const size_t*)tmp, (const size_t*) &tmp [5], [] (const size_t& s) {
std::cout << s << "->";
std::cout << *(&s +1) << " ";
});
std::cout << std::endl;
}