Can a comparator be used to set a new key? - c++

I need to change the "key" of a multiset:
multiset<IMidiMsgExt, IMidiMsgExtCompByNoteNumber> playingNotes;
so that when I use the .find() function it searches for and returns the first object (iterator) with that NoteNumber property value.
I say "first" because my multiset could contain objects with the same "key". So I did:
struct IMidiMsgExtCompByNoteNumber {
bool operator()(const IMidiMsgExt& lhs, const IMidiMsgExt& rhs) {
return lhs.NoteNumber() < rhs.NoteNumber();
}
};
but when I try to do:
auto it = playingNotes.find(60);
the compiler says no instance of overloaded function "std::multiset<_Kty, _Pr, _Alloc>::find [with _Kty=IMidiMsgExt, _Pr=IMidiMsgExtCompByNoteNumber, _Alloc=std::allocator<IMidiMsgExt>]" matches the argument list
Am I misunderstanding the whole thing? What's wrong?

I do believe that you have some misunderstandings here:
1. Part of an associative container's type is its key type and comparator. Because C++ is strongly typed, the only way to change the comparator on a container is to create a new container, copying or moving all the elements into it.
2. Creating a copy of all the elements in a container is a potentially expensive process.
3. By creating a copy you are violating the Single Source of Truth best practice.
4. multiset is used infrequently; I have used it once in my career. Others have pointed out its shortcomings and recommended that you use another container or write your own container; in my case, I'd suggest simply using vector and sorting it how you want when you have to.
I'm going to catalog your comments to show how the answer I've already given you is correct:
We're going to assume that the multiset<IMidiMsgExt, IMidiMsgExtCompByNoteNumber> that you've selected is necessary and cannot be improved upon by using vector as suggested in 4, where:
struct IMidiMsgExtCompByNoteNumber {
bool operator()(const IMidiMsgExt& lhs, const IMidiMsgExt& rhs) {
return lhs.NoteNumber() < rhs.NoteNumber();
}
};
You cannot use multiset::find because that requires you to specify the exact IMidiMsgExt you are searching for; so you'll need to use find_if(cbegin(playingNotes), cend(playingNotes), [value = int{60}](const auto& i){return i.mNote == value;}) to search for a specific property value. That will be fine to use directly on playingNotes without changing the sorting, because you say:
I want to delete the first note that has mNote of 60. No matter the mTime when deleting.
You'll need to capture the result of the [find_if], check if it is valid, and if so erase it as demonstrated in my answer, because you say:
The first element find will find for that, erase. [sic]
I would roll the code from my answer into a function because you say:
Ill recall find if I want another element, maybe with same value, to get deleted [sic]
Your final solution should be to write a function like this:
bool foo(multiset<IMidiMsgExt, IMidiMsgExtCompByNoteNumber>& playingNotes, const int value) {
const auto it = find_if(cbegin(playingNotes), cend(playingNotes), [=](const auto& i){return i.mNote == value;});
const auto result = it != cend(playingNotes);
if(result) {
playingNotes.erase(it);
}
return result;
}
And you'd call it something like this: foo(playingNotes, 60). If you wish to know whether an element was removed, you may test foo's return value.
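As an alternative to the linear find_if scan, here is a minimal sketch of heterogeneous lookup, which C++14 enables when the comparator declares is_transparent. It assumes a stand-in type Note in place of IMidiMsgExt (whose definition is not shown in the question); Note, CompByNoteNumber and eraseFirstNote are hypothetical names used only for illustration.

```cpp
#include <set>

// Hypothetical stand-in for IMidiMsgExt; only the note number matters here.
struct Note {
    int note;
    int time;
    int NoteNumber() const { return note; }
};

// Declaring is_transparent enables the template overloads of find(),
// lower_bound() etc., so a plain int can be used as the lookup key.
struct CompByNoteNumber {
    using is_transparent = void;
    bool operator()(const Note& lhs, const Note& rhs) const {
        return lhs.NoteNumber() < rhs.NoteNumber();
    }
    bool operator()(const Note& lhs, int rhs) const { return lhs.NoteNumber() < rhs; }
    bool operator()(int lhs, const Note& rhs) const { return lhs < rhs.NoteNumber(); }
};

using PlayingNotes = std::multiset<Note, CompByNoteNumber>;

// Erases the first element whose note number equals value.
// Returns true when an element was removed.
bool eraseFirstNote(PlayingNotes& notes, int value) {
    // lower_bound yields the first element that is not less than value.
    const auto it = notes.lower_bound(value);
    if (it == notes.end() || it->NoteNumber() != value) {
        return false;
    }
    notes.erase(it);
    return true;
}
```

With this comparator, notes.find(60) also compiles, but on a multiset find may return any of the equivalent elements, which is why the sketch uses lower_bound to get the first one.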

Related

Why does std::set not have a "contains" member function?

I'm heavily using std::set<int> and often I simply need to check if such a set contains a number or not.
I'd find it natural to write:
if (myset.contains(number))
...
But because of the lack of a contains member, I need to write the cumbersome:
if (myset.find(number) != myset.end())
..
or the not as obvious:
if (myset.count(element) > 0)
..
Is there a reason for this design decision ?
I think it was probably because they were trying to make std::set and std::multiset as similar as possible. (And obviously count has a perfectly sensible meaning for std::multiset.)
Personally I think this was a mistake.
It doesn't look quite so bad if you pretend that count is just a misspelling of contains and write the test as:
if (myset.count(element))
...
It's still a shame though.
To be able to write if (s.contains()), contains() has to return a bool (or a type convertible to bool, which is another story), like binary_search does.
The fundamental reason behind the design decision not to do it this way is that contains() which returns a bool would lose valuable information about where the element is in the collection. find() preserves and returns that information in the form of an iterator, therefore is a better choice for a generic library like STL. This has always been the guiding principle for Alex Stepanov, as he has often explained (for example, here).
As to the count() approach in general, although it's often an okay workaround, the problem with it is that it does more work than a contains() would have to do.
That is not to say that a bool contains() isn't a very nice-to-have or even necessary. A while ago we had a long discussion about this very same issue in the ISO C++ Standard - Future Proposals group.
It lacks it because nobody added it. Nobody added it because the containers from the STL that the std library incorporated were designed to be minimal in interface. (Note that std::string did not come from the STL in the same way).
If you don't mind some strange syntax, you can fake it:
template<class K>
struct contains_t {
K&& k;
template<class C>
friend bool operator->*( C&& c, contains_t&& self ) {
// the right-hand operand must be named so that its k member can be reached
auto range = std::forward<C>(c).equal_range(std::forward<K>(self.k));
return range.first != range.second;
// faster than:
// return std::forward<C>(c).count( std::forward<K>(self.k) ) != 0;
// for multi-meows with lots of duplicates
}
};
template<class K>
contains_t<K> contains( K&& k ) {
return {std::forward<K>(k)};
}
use:
if (some_set->*contains(some_element)) {
}
Basically, you can write extension methods for most C++ std types using this technique.
It makes a lot more sense to just do this:
if (some_set.count(some_element)) {
}
but I am amused by the extension method method.
The really sad thing is that writing an efficient contains could be faster on a multimap or multiset, as they just have to find one element, while count has to find each of them and count them.
A multiset containing 1 billion copies of 7 (you know, in case you run out) can have a really slow .count(7), but could have a very fast contains(7).
With the above extension method, we could make it faster for this case by using lower_bound, comparing to end, and then comparing to the element. Doing that for an unordered meow as well as an ordered meow would require fancy SFINAE or container-specific overloads however.
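A minimal sketch of the lower_bound idea described above, written as a free function rather than the extension-method trick. The name contains_ordered is hypothetical, and the sketch assumes an ordered, set-like container (std::set or std::multiset) whose elements are the keys themselves; it does not handle maps or unordered containers.

```cpp
#include <set>

// Hypothetical helper for ordered, set-like containers only.
template <class Container, class Key>
bool contains_ordered(const Container& c, const Key& key) {
    // First element that is not less than key (one O(log N) descent).
    const auto it = c.lower_bound(key);
    // key is present exactly when that element is also not greater than key.
    return it != c.end() && !c.key_comp()(key, *it);
}
```

Unlike count, this never walks the run of duplicates, so its cost does not grow with the number of equal elements.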
You are looking at a particular case and not seeing the bigger picture. As stated in the documentation, std::set meets the requirements of the AssociativeContainer concept. For that concept it does not make sense to have a contains method, as it is pretty much useless for std::multiset and std::multimap, whereas count works fine for all of them. A contains method could have been added as an alias for count for std::set, std::map and their hashed versions (like length for size() in std::string), but it looks like the library creators did not see a real need for it.
Although I don't know why std::set has no contains but count which only ever returns 0 or 1,
you can write a templated contains helper function like this:
template<class Container, class T>
auto contains(const Container& v, const T& x)
-> decltype(v.find(x) != v.end())
{
return v.find(x) != v.end();
}
And use it like this:
if (contains(myset, element)) ...
The true reason for set is a mystery for me, but one possible explanation for this same design in map could be to prevent people from writing inefficient code by accident:
if (myMap.contains("Meaning of universe"))
{
myMap["Meaning of universe"] = 42;
}
Which would result in two map lookups.
Instead, you are forced to get an iterator. This gives you a mental hint that you should reuse the iterator:
auto position = myMap.find("Meaning of universe");
if (position != myMap.cend())
{
position->second = 42;
}
which consumes only one map lookup.
When we realize that set and map are made from the same flesh, we can apply this principle also to set. That is, if we want to act on an item in the set only if it is present in the set, this design can prevent us from writing code as this:
struct Dog
{
std::string name;
void bark() const; // const, because elements of a std::set cannot be modified
};
bool operator<(const Dog& left, const Dog& right)
{
return left.name < right.name;
}
std::set<Dog> dogs;
...
if (dogs.contains(Dog{"Husky"}))
{
dogs.find(Dog{"Husky"})->bark();
}
Of course all this is a mere speculation.
Since c++20,
bool contains( const Key& key ) const
is available.
I'd like to point out, as mentioned by Andy, that C++20 added the contains member function to maps and sets:
bool contains( const Key& key ) const; (since C++20)
Now I'd like to focus my answer regarding performance vs readability.
In terms of performance, compare the two versions:
#include <unordered_map>
#include <string>
using hash_map = std::unordered_map<std::string,std::string>;
hash_map a;
std::string get_cpp20(hash_map& x,std::string str)
{
if(x.contains(str))
return x.at(str);
else
return "";
};
std::string get_cpp17(hash_map& x,std::string str)
{
if(const auto it = x.find(str); it !=x.end())
return it->second;
else
return "";
};
You will find that the cpp20 version takes two calls to std::_Hash_find_last_result while the cpp17 takes only one call.
Now I find myself with many data structure with nested unordered_map.
So you end up with something like this:
using my_nested_map = std::unordered_map<std::string,std::unordered_map<std::string,std::unordered_map<int,std::string>>>;
std::string get_cpp20_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
if(x.contains(level1) &&
x.at(level1).contains(level2) &&
x.at(level1).at(level2).contains(level3))
return x.at(level1).at(level2).at(level3);
else
return "";
};
std::string get_cpp17_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
if(const auto it_level1=x.find(level1); it_level1!=x.end())
if(const auto it_level2=it_level1->second.find(level2);it_level2!=it_level1->second.end())
if(const auto it_level3=it_level2->second.find(level3);it_level3!=it_level2->second.end())
return it_level3->second;
return "";
};
Now, if you have plenty of conditions in between these ifs, using the iterators really is painful, error-prone and unclear. I often find myself looking back at the definition of the map to understand what kind of object was at level 1 or level 2, while with the cpp20 version you see at(level1).at(level2)... and understand immediately what you are dealing with.
So in terms of code maintenance and review, contains is a very nice addition.
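One possible middle ground, sketched here under my own assumptions, is a small helper that performs a single find and hands back a pointer, so the nested version keeps one lookup per level while staying fairly readable. find_ptr, nested_map and get_nested are hypothetical names, not part of the standard library.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: one lookup, returns a pointer to the mapped value
// or nullptr when the key is absent.
template <class Map, class Key>
const typename Map::mapped_type* find_ptr(const Map& m, const Key& key) {
    const auto it = m.find(key);
    return it == m.end() ? nullptr : &it->second;
}

using nested_map = std::unordered_map<
    std::string,
    std::unordered_map<std::string, std::unordered_map<int, std::string>>>;

// One lookup per level; returns "" when any level is missing.
std::string get_nested(const nested_map& x, const std::string& level1,
                       const std::string& level2, int level3) {
    if (const auto* l1 = find_ptr(x, level1))
        if (const auto* l2 = find_ptr(*l1, level2))
            if (const auto* l3 = find_ptr(*l2, level3))
                return *l3;
    return "";
}
```

Whether this reads better than the contains/at chain is a matter of taste; the point is only that the lookup count and the readability concern can be separated.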
What about binary_search ?
set <int> set1;
set1.insert(10);
set1.insert(40);
set1.insert(30);
if(std::binary_search(set1.begin(),set1.end(),30))
bool found=true;
contains() has to return a bool. Using C++ 20 compiler I get the following output for the code:
#include<iostream>
#include<map>
using namespace std;
int main()
{
multimap<char,int>mulmap;
mulmap.insert(make_pair('a', 1)); //multiple similar key
mulmap.insert(make_pair('a', 2)); //multiple similar key
mulmap.insert(make_pair('a', 3)); //multiple similar key
mulmap.insert(make_pair('b', 3));
mulmap.insert({'a',4});
mulmap.insert(pair<char,int>('a', 4));
cout<<mulmap.contains('c')<<endl; //Output:0 as it doesn't exist
cout<<mulmap.contains('b')<<endl; //Output:1 as it exist
}
Another reason is that it would give a programmer the false impression that std::set is a set in the mathematical, set-theory sense. If contains() were implemented, many other questions would follow: if a std::set has contains() for a value, why doesn't it have it for another set? Where are union(), intersection() and the other set operations and predicates?
The answer is, of course, that some of the set operations are already implemented as functions in <algorithm> (std::set_union() etc.) and others are as trivially implemented as contains(). Functions and function objects work better with mathematical abstractions than member functions do, and they are not limited to one particular container type.
Anyone who needs to implement full mathematical-set functionality has a choice not only of the underlying container but also of implementation details. For example, would a theory_union() function work with immutable objects, which is better suited to functional programming, or would it modify its operands and save memory? Would it be implemented as a function object from the start, or would it be better to implement it as a C-style function and use std::function<> if needed?
As it is now, std::set is just a container, well suited for implementing a set in the mathematical sense, but it is nearly as far from being a theoretical set as std::vector is from being a theoretical vector.
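To illustrate how the existing algorithms cover these operations, here is a small sketch that wraps std::set_union and std::includes in free functions. The wrapper names set_union_of and is_subset_of are hypothetical; only the two standard algorithms are real library calls, and both require sorted input ranges, which std::set provides.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical wrapper: returns a new set, leaving both operands untouched.
template <class T>
std::set<T> set_union_of(const std::set<T>& a, const std::set<T>& b) {
    std::set<T> out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.end()));
    return out;
}

// Hypothetical wrapper: true when every element of sub is also in super.
template <class T>
bool is_subset_of(const std::set<T>& sub, const std::set<T>& super) {
    return std::includes(super.begin(), super.end(), sub.begin(), sub.end());
}
```

This version takes the "immutable operands" route mentioned above; an in-place variant would be an equally valid design.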

error: expression must be a modifiable lvalue when using find_if

I have two vectors of classes that contain mainly strings, and I'm trying to keep track of how many times there was a match between two vectors. I kept an int counter in one of the two public classes (necessary for another function). However, std::find_if doesn't seem to allow me to modify nor assign this counter variable.
Following is the std::find_if search algorithm:
for (Vector1& v1 : vector1) {
auto res = find_if(vector2.begin(), vector2.end(),
[=](Vector2 v2) {
if (v2.code == v1.code) {
v1.counter++; // <-- where the error occurs
return true;
}
else
return false;
}
);
}
I can't seem to figure out why this happens; my speculation is that the third parameter for the find_if algorithm takes in a const value. But that shouldn't affect my vector1, right?
I used nested ranged for-loops instead, and it works perfectly. However, I'd like to try using this find_if algorithm instead...
You have a problem with capture/pass by value/reference.
It should be [&] or [&v1] - variables captured by value are non-mutable by default, and lambda's operator() is const. You could use the mutable keyword to fix the error, which makes operator() non-const, but you wouldn't see the changes made to v1 anyways.
Additionally, you should be passing by Vector2 const& v2, auto const& or auto && for the sake of avoiding a copy.
Together:
[&v1](Vector2 const& v2) { ... }
I'd like to try using this find_if algorithm instead...
But that's not what it's for. If you aren't going to use the returned iterator, don't do it. You should be getting a warning. Use loops for simple iteration.
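A sketch of one way to follow that advice, assuming stand-in types Item1 and Item2 for the question's two classes (their real definitions are not shown) and assuming the counter should go up once per element of the first vector that has any match. It uses std::any_of, which returns a bool instead of an iterator, and captures v1 by reference.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's two classes.
struct Item1 {
    std::string code;
    int counter;
};
struct Item2 {
    std::string code;
};

void countMatches(std::vector<Item1>& vector1, const std::vector<Item2>& vector2) {
    for (Item1& v1 : vector1) {
        // [&v1] captures by reference; v2 is taken by const reference to avoid a copy.
        const bool found = std::any_of(
            vector2.begin(), vector2.end(),
            [&v1](const Item2& v2) { return v2.code == v1.code; });
        if (found) {
            ++v1.counter;  // modifies the element stored in vector1
        }
    }
}
```

If every match should be counted rather than just the first, std::count_if with the same lambda would be the closer fit.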

Ordering a container on something else than the key

I am currently trying to implement a A* algorithm and I've come to a problem :
I want to keep a set of distinct objects, identified by a hash (I've used boost::hash and family, but can use anything else) and ordered by a public int value, member of those objects.
The goal is being able to retrieve the smaller object based on the int value in O(1) and guarantee uniqueness in the most efficient manner (hash seemed a good way to achieve that, but i'm open to alternatives). I don't need to iterate over the container if those two conditions are met.
Is there any already present implementation that answer those specifications ? Am I mistaken in my assumptions ? Should I just extend any existing container ?
EDIT :
Apparently unclear on what "smaller based on int value" means. I mean that my object has a public attribute (lets say score). For two objects a and b, a < b if and only if a.score < b.score.
I want a and b to be in a container, ordered by score. And if I try to insert c with c.hash == a.hash, I want the insertion to fail.
Although std::priority_queue is an adapter, its Container template parameter has to satisfy SequenceContainer, so you can't build one backed by a std::set.
It looks like your best option is to maintain both a set and a priority queue, and use the former to control insertion into the latter. It may be a good idea to encapsulate that into a container-concept class, but you might get away with a couple of methods if your use of it is quite localised.
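A minimal sketch of that set-plus-priority-queue combination, under the assumption that each object carries a precomputed hash and an int score. Node, ByScoreGreater and UniqueMinQueue are hypothetical names; the sketch treats equal hashes as "same object" and ignores hash collisions.

```cpp
#include <cstddef>
#include <queue>
#include <unordered_set>
#include <vector>

// Hypothetical node type: hash identifies the object, score orders it.
struct Node {
    std::size_t hash;
    int score;
};

// With this comparator the smallest score ends up at top().
struct ByScoreGreater {
    bool operator()(const Node& a, const Node& b) const { return a.score > b.score; }
};

class UniqueMinQueue {
public:
    // Returns false, inserting nothing, when the hash is already present.
    bool push(const Node& n) {
        if (!seen_.insert(n.hash).second) {
            return false;
        }
        heap_.push(n);
        return true;
    }
    const Node& top() const { return heap_.top(); }  // O(1)
    void pop() {                                     // O(log N)
        seen_.erase(heap_.top().hash);
        heap_.pop();
    }
    bool empty() const { return heap_.empty(); }
    std::size_t size() const { return heap_.size(); }

private:
    std::unordered_set<std::size_t> seen_;
    std::priority_queue<Node, std::vector<Node>, ByScoreGreater> heap_;
};
```

For A* you may prefer to keep popped hashes in the set so that closed nodes are never re-inserted; that is a one-line change in pop() and depends on your algorithm.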
Use a custom comparator and a std::set:
#include <set>
#include <string>
struct Object
{
int value;
long hash;
std::string data;
Object(int value, std::string data) :
value(value), data(data)
{
}
bool operator<(const Object& other) const
{
return data < other.data;
}
};
struct ObjComp1
{
bool operator()(const Object& lhs, const Object& rhs) const
{
return lhs.value < rhs.value;
}
};
struct ObjComp2
{
bool operator()(const Object& lhs, const Object& rhs) const
{
if (lhs.value != rhs.value)
{
return lhs.value < rhs.value;
}
return lhs < rhs;
}
};
int main()
{
Object o1(5, "a");
Object o2(1, "b");
Object o3(1, "c");
Object o4(1, "c");
std::set<Object, ObjComp1> set;
set.insert(o1);
set.insert(o2);
set.insert(o3);
set.insert(o4);
std::set<Object, ObjComp2> set2;
set2.insert(o1);
set2.insert(o2);
set2.insert(o3);
set2.insert(o4);
return 0;
}
First variant will allow you to only insert o1 and o2, second variant will allow you to insert o1, o2 and o3, as it's not really clear which one you need. The only downside is that you need to code your own operator< for the Object type.
Alternatively, if you don't want to create a custom operator< for your data type, you can wrap a std::map > but this is less straightforward.
You could use the stl type priority_queue. If your elements are integers then you could do:
priority_queue<int> q;
Priority queues are internally implemented with heaps: a complete binary tree whose root is always the element that compares greatest under the queue's comparator. With the default std::less that is the maximum, so to keep the minimum at the root you would declare the queue with std::greater as its third template argument. Either way, you can read the root in O(1) by invoking top().
However, as your algorithm progresses, you will need to extract items with pop(). Since the heap is a binary tree, extraction takes O(log N), which is not O(1), but it is a very good time and it is guaranteed, in contrast with the expected time you would get from an imperfect hash table.
I do not know a way of maintaining a set and extracting the minimum in O(1).

Skipping iterator

I have a sequence of values that I'd like to pass to a function that takes a (iterator begin, iterator end) pair. However, I only want every second element in the original sequence to be processed.
Is there a nice way using Standard-Lib/Boost to create an iterator facade that will allow me to pass in the original sequence? I figured something simple like this would already be in the boost iterators or range libraries, but I didn't find anything.
Or am I missing another completely obvious way to do this? Of course, I know I always have the option of copying the values to another sequence, but that's not what I want to do.
Edit: I know about filter_iterator, but that filters on values - it doesn't change the way the iteration advances.
I think you want boost::adaptors::strided
struct TrueOnEven {
template< typename T >
bool operator()(const T&) { return mCount++ % 2 == 0; }
TrueOnEven() : mCount(0) {}
private:
int mCount;
};
int main() {
std::vector< int > tVec, tOtherVec;
...
typedef boost::filter_iterator< TrueOnEven, std::vector< int >::iterator > TakeEvenFilterType; // second argument is the wrapped iterator type
std::copy(
TakeEvenFilterType(tVec.begin(), tVec.end()),
TakeEvenFilterType(tVec.end(), tVec.end()),
std::back_inserter(tOtherVec));
}
To be honest, this is anything but nice and intuitive. I wrote a simple "Enumerator" library including lazy integrated queries to avoid a hotchpotch like the above. It allows you to write:
Query::From(tVec.begin(), tVec.end())
.Skip<2>()
.ToStlSequence(std::back_inserter(tOtherVec));
where Skip<2> basically instantiates a generalized "Filter" which skips every N-th (in this case every second) element.
Here's Boost's filter iterator. It is exactly what you want.
UPDATE: Sorry, read wrongly-ish. Here's a list of all iterator funkiness in Boost:
http://www.boost.org/doc/libs/1_46_1/libs/iterator/doc/#specialized-adaptors
I think a plain iterator_adaptor with an overloaded operator++ that increments the underlying iterator value twice is all you need.

c++ map find() to possibly insert(): how to optimize operations?

I'm using the STL map data structure, and at the moment my code first invokes find(): if the key was not previously in the map, it calls insert(), otherwise it does nothing.
map<Foo*, string>::iterator it;
it = my_map.find(foo_obj); // 1st lookup
if(it == my_map.end()){
my_map[foo_obj] = "some value"; // 2nd lookup
}else{
// ok do nothing.
}
I was wondering if there is a better way than this, because as far as I can tell, in this case when I want to insert a key that is not present yet, I perform 2 lookups in the map data structures: one for find(), one in the insert() (which corresponds to the operator[] ).
Thanks in advance for any suggestion.
Normally if you do a find and maybe an insert, then you want to keep (and retrieve) the old value if it already existed. If you just want to overwrite any old value, map[foo_obj]="some value" will do that.
Here's how you get the old value, or insert a new one if it didn't exist, with one map lookup:
typedef std::map<Foo*,std::string> M;
typedef M::iterator I;
std::pair<I,bool> const& r=my_map.insert(M::value_type(foo_obj,"some value"));
if (r.second) {
// value was inserted; now my_map[foo_obj]="some value"
} else {
// value wasn't inserted because my_map[foo_obj] already existed.
// note: the old value is available through r.first->second
// and may not be "some value"
}
// in any case, r.first->second holds the current value of my_map[foo_obj]
This is a common enough idiom that you may want to use a helper function:
template <class M,class Key>
typename M::mapped_type &
get_else_update(M &m,Key const& k,typename M::mapped_type const& v) {
return m.insert(typename M::value_type(k,v)).first->second;
}
get_else_update(my_map,foo_obj,"some value");
If you have an expensive computation for v you want to skip if it already exists (e.g. memoization), you can generalize that too:
template <class M,class Key,class F>
typename M::mapped_type &
get_else_compute(M &m,Key const& k,F f) {
typedef typename M::mapped_type V;
std::pair<typename M::iterator,bool> r=m.insert(typename M::value_type(k,V()));
V &v=r.first->second;
if (r.second)
f(v);
return v;
}
where e.g.
struct F {
void operator()(std::string &val) const
{ val=std::string("some value")+" that is expensive to compute"; }
};
get_else_compute(my_map,foo_obj,F());
If the mapped type isn't default constructible, then make F provide a default value, or add another argument to get_else_compute.
There are two main approaches. The first is to use the insert function that takes a value type and returns a pair of an iterator and a bool. The bool indicates whether an insertion took place, and the iterator points to either the existing element with the same key or the newly inserted element.
my_map.insert( map<Foo*, string>::value_type(foo_obj, "some_value") );
The advantage of this is that it is simple. The major disadvantage is that you always construct a new value for the second parameter whether or not an insertion is required. In the case of a string this probably doesn't matter. If your value is expensive to construct this may be more wasteful than necessary.
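If C++17 is available (an assumption; the answer predates it), std::map::try_emplace addresses that disadvantage: it performs one lookup and only constructs the mapped value when the key is absent. A minimal sketch, with get_or_emplace as a hypothetical helper name:

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper; requires C++17 for try_emplace.
template <class Map, class Key, class... Args>
typename Map::mapped_type& get_or_emplace(Map& m, const Key& key, Args&&... args) {
    // try_emplace does nothing with args when key already exists,
    // so no mapped value is constructed in that case.
    return m.try_emplace(key, std::forward<Args>(args)...).first->second;
}
```

The returned reference is to the existing value when the key was already present, and to the newly constructed one otherwise.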
A way round this is to use the 'hint' version of insert.
std::pair< map<Foo*, string>::iterator, map<Foo*, string>::iterator >
range = my_map.equal_range(foo_obj);
if (range.first == range.second)
{
if (range.first != my_map.begin())
--range.first;
my_map.insert(range.first, map<Foo*, string>::value_type(foo_obj, "some_value") );
}
The insertion is guaranteed to be in amortized constant time only if the element is inserted immediately after the supplied iterator, hence the --, if possible.
Edit
If this need to -- seems odd, then it is. There is an open defect (233) in the standard that highlights this issue, although the description of the issue as it applies to map is clearer in the duplicate issue 246.
In your example, you want to insert when it's not found. If default construction and setting the value after that is not expensive, I'd suggest simpler version with 1 lookup:
string& r = my_map[foo_obj]; // only lookup & insert if not existed
if (r == "") r = "some value"; // if default (obj wasn't in map), set value
// else existed already, do nothing
If your example shows what you actually want, consider adding that value as str Foo::s instead: you already have the object, so no lookups would be needed; just check whether that member has its default value. And keep the objects in a std::set. Even extending the class as FooWithValue2 may be cheaper than using map.
But if joining data through the map like this is really needed, or if you want to update only if the key existed, then Jonathan has the answer.