C++ std::set uniqueness override

How does the std::set<T> container check if two objects are unique? I tried overriding the equality operators (==), but it didn't work.
The reason I want to do this is that I have let's say a class Person and I specify that my Person is the same person if they have the same name (maybe even birthdate, address, etc.).
On cppreference.com, they write the following (which is a bit unclear to me):
Everywhere the standard library uses the Compare concept, uniqueness
is determined by using the equivalence relation. In imprecise terms,
two objects a and b are considered equivalent (not unique) if neither
compares less than the other: !comp(a, b) && !comp(b, a).
I assume, that this question also expands to other STL containers and even algorithms (maybe even to the whole STL). So if in future, I want to use the function std::find, I would be looking up the name of the person and not the object itself. Is this correct?
EDIT
I want to add some example code.
// My operator overloading comparing two strings.
bool operator==(const Node& rhs) const {
// compare() returns 0 when the strings are equal, so test for that explicitly.
return this->name.compare(rhs.name) == 0;
}
Then, in the unit test, I add an object with the same name to the set twice. It is added twice (but the two should be the same according to operator==).
void test_adding_two_identical_nodes() {
// The pool is a set<Node> inside
model::Node_Pool pool{};
pool.store_node(model::Node{"Peter"});
pool.store_node(model::Node{"Peter"});
// Should be only 1 because the same node should be added once into a set.
ASSERT_EQUAL(1, pool.size());
}

std::set<T> doesn't compare using ==. It compares, by default, using std::less<T>. In turn std::less<T> uses, by default, the operator <.
One way to use your type in a set is to overload operator<, like so:
#include <set>
#include <cassert>
#include <cstring> // for std::strcmp
struct Person {
const char *name;
int uid;
};
bool operator<(const Person& a, const Person& b) {
return a.uid < b.uid;
}
int main () {
Person joe = {"joseph", 1};
Person bob = {"robert", 2};
Person rob = {"robert", 3};
Person sue = {"susan", 4};
std::set<Person> people;
people.insert(joe);
people.insert(bob);
people.insert(rob);
assert(people.count(joe) == 1);
assert(people.count(bob) == 1);
assert(people.count(rob) == 1);
assert(people.count(sue) == 0);
Person anonymous_3 = {"", 3};
assert( std::strcmp(people.find(anonymous_3)->name, "robert") == 0);
}
Alternatively, one can pass a comparison function object type as a template parameter when declaring the set. In the example above, this might be the comparator:
struct Person_Compare {
bool operator()(const Person& a, const Person& b) const {
return a.uid < b.uid;
}
};
And the std::set declaration might look like this:
std::set<Person, Person_Compare> people;
The rest of the example is unchanged.
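To tie this back to the question, here is a minimal sketch of a set that treats two nodes as the same element when their names match. The `Node` struct, the `NodeNameLess` comparator, and the helper functions are hypothetical stand-ins for the asker's classes, not the actual `model::Node` or `Node_Pool`.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the question's Node class.
struct Node {
    std::string name;
};

// Strict weak ordering on the name only: two nodes with the same name
// are equivalent, so the set keeps just one of them.
struct NodeNameLess {
    bool operator()(const Node& a, const Node& b) const {
        return a.name < b.name;
    }
};

// The equivalence test quoted from cppreference, spelled out.
bool equivalent(const Node& a, const Node& b) {
    NodeNameLess comp;
    return !comp(a, b) && !comp(b, a);
}

// Inserts two nodes named "Peter" and one named "Paul",
// then reports how many elements the set holds.
std::size_t count_after_duplicate_inserts() {
    std::set<Node, NodeNameLess> pool;
    pool.insert(Node{"Peter"});
    pool.insert(Node{"Peter"});  // rejected: equivalent to the first one
    pool.insert(Node{"Paul"});
    return pool.size();
}
```

Calling `count_after_duplicate_inserts()` yields 2, because the second "Peter" is equivalent to the first under the comparator. No operator== is involved at any point.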

First of all, don't overload comparison operators to compare anything but TOTAL equivalence. Otherwise you end up with a maintenance nightmare.
That said, the operator you would overload is operator<. A better option, though, is to give the set a comparator type.
// No base class is needed: std::binary_function was deprecated in C++11 and removed in C++17.
struct compare_people
{
bool operator () ( person const& a, person const& b) const { return a.name() < b.name(); }
};
std::set<person, compare_people> my_set;
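On the question's side point about std::find: the free algorithm std::find compares elements with operator==, while the member function find of std::set uses the set's comparator. The sketch below illustrates the difference. The `Person` struct, `NameLess`, and both helper functions are made up for illustration.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical type with two fields.
struct Person {
    std::string name;
    int birth_year;
};

// Ordering used by the set: name only.
struct NameLess {
    bool operator()(const Person& a, const Person& b) const {
        return a.name < b.name;
    }
};

// Equality used by std::find: every field must match.
bool operator==(const Person& a, const Person& b) {
    return a.name == b.name && a.birth_year == b.birth_year;
}

// Lookup through the set's own comparator (logarithmic time).
bool found_by_member_find(const std::set<Person, NameLess>& s, const Person& key) {
    return s.find(key) != s.end();
}

// Lookup through the generic algorithm (linear time, uses operator==).
bool found_by_std_find(const std::set<Person, NameLess>& s, const Person& key) {
    return std::find(s.begin(), s.end(), key) != s.end();
}
```

With a set holding `Person{"Peter", 1980}`, looking up `Person{"Peter", 1999}` succeeds through the member find (same name) but fails through std::find (different birth year). So searching "by name" only works with std::find if operator== itself compares just the name, or if you use std::find_if with a predicate.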

Related

Differences between various Custom Comparator functions in C++

I found that there are different ways to define custom compare functions for a user defined object. I would like to know the things that I should take into account before choosing one over another.
If I have student object, I can write a custom compare function in the following ways.
struct Student
{
string name;
uint32_t age;
// Method 1: Using operator <
bool operator<(const Student& ob) const
{
return age < ob.age;
}
};
// Method 2: Custom Compare Function
bool compStudent(const Student& a, const Student& b)
{
return a.age < b.age;
}
// Method 3: Using operator ()
struct MyStudComp
{
bool operator() (const Student& a, const Student& b) const
{
return a.age < b.age;
}
}obComp;
To sort a vector of students I can use either of the below methods.
vector<Student> studs; // Consider I have this object populated
std::sort(studs.begin(), studs.end()); // Method 1
std::sort(studs.begin(), studs.end(), compStudent); // Method 2
std::sort(studs.begin(), studs.end(), obComp); // Method 3
// Method 4: Using Lambda
sort(studs.begin(), studs.end(),
[](const Student& a, const Student& b) -> bool
{
return a.age < b.age;
});
How are these methods different, and how should I decide between them? Thanks in advance.
The performance between the different methods is not very different, however, using < will let you be more flexible, and makes using built-ins much easier. I also think using () is kind of weird.
The bigger issue in your example is that your methods should be using const refs instead of values. I.e. bool operator<(Student ob) could be friend bool operator<(const Student& ls, const Student& rs){...}. Also, see here for some examples of different things to consider when overloading operators.
The performance is not going to be noticeably different. But it's convenient (and expected) in many cases to have an operator<, so I'd go for that over the special compare function.
There really is no "right" way per se, but if it makes sense for your object to have custom comparators (i.e. operator< etc.) then it would be wise to simply use those. However you may want to sort your object based on a different field member and so providing a custom lambda based on those field comparisons would make sense in that case.
For example, your Student class currently uses an overloaded operator< to compare student ages, so if you are sorting a container of Students based on age then just use this operator implicitly. However, you may want (at another time) to sort based on the names so in this case you could provide a custom lambda as the most elegant method:
std::vector<Student> vec;
// populate vec
std::sort(vec.begin(), vec.end(), [](const auto& lhs, const auto& rhs) { return lhs.name < rhs.name; });
where the student names are sorted via lexicographical comparisons.
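As a small sketch of both options side by side, assuming a `Student` whose operator< compares ages as in the question (the helper functions `names_of`, `names_sorted_by_age`, and `names_sorted_by_name` are invented for this example):

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct Student {
    std::string name;
    std::uint32_t age;
};

// "Natural" order of a Student, as in the question: by age.
bool operator<(const Student& a, const Student& b) {
    return a.age < b.age;
}

// Collects the names in their current order.
std::vector<std::string> names_of(const std::vector<Student>& v) {
    std::vector<std::string> out;
    for (const Student& s : v) out.push_back(s.name);
    return out;
}

// Uses operator< implicitly.
std::vector<std::string> names_sorted_by_age(std::vector<Student> v) {
    std::sort(v.begin(), v.end());
    return names_of(v);
}

// Uses a one-off lambda for a different ordering.
std::vector<std::string> names_sorted_by_name(std::vector<Student> v) {
    std::sort(v.begin(), v.end(),
              [](const Student& l, const Student& r) { return l.name < r.name; });
    return names_of(v);
}
```

Both functions take the vector by value, so the caller's data is left untouched.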
How are these methods different and how should I decide between these.
They differ in their implicit statements of intent. You should use the form that expresses your intent most succinctly.
Relying on operator< implies to someone reading your code that your objects are implicitly ordered, like numbers or strings. They ought to be things that people would say, "well obviously x comes before y".
If the ordering of the set is more abstract, then an ordering function might be better, because it expresses the idea that you are imposing an order on the set which may not be a natural order.
In the example you give, I might choose to express the intent with either a function object (ByAscendingAge below) or a free function (age_is_less below). A reader of the code using the set is then fully aware of the intent.
For example:
#include <cstdint>
#include <set>
#include <string>
#include <algorithm>
#include <iterator>
struct Student
{
std::string name;
std::uint32_t age;
};
struct ByAscendingAge
{
bool operator() (const Student& a, const Student& b) const
{
return a.age < b.age;
}
};
bool age_is_less(const Student& l, const Student& r)
{
return l.age < r.age;
}
bool name_is_less(const Student& l, const Student& r)
{
return l.name < r.name;
}
int main()
{
// this form expresses the intent that any 2 different sets of this type can have different ordering
using students_by_free_function = std::set<Student, bool (*)(const Student&, const Student&)>;
// ordered by age
students_by_free_function by_age_1(age_is_less);
// ordered by name
students_by_free_function by_name_1(name_is_less);
// above two sets are the same type so we can assign them, which implicitly reorders
by_age_1 = by_name_1;
// this form expresses the intent that the ordering is a PROPERTY OF THIS TYPE OF SET
using students_by_age = std::set<Student, ByAscendingAge>;
// note that we don't need a comparator in the constructor
students_by_age by_age_2;
// by_age_2 = by_age_1; // not allowed because the sets are a different type
// but we can assign iterator ranges of course
std::copy(std::begin(by_age_1),
std::end(by_age_1),
std::inserter(by_age_2,
std::end(by_age_2)));
}

How do you order objects in a priority_queue in C++?

I couldn't find any information on how to order objects in a priority queue. I tried this:
class Person {
...
public:
bool operator<(const Person& p) {
return age < p.age;
}
};
int main() {
priority_queue<Person*> people;
people.push(new Person("YoungMan", 21));
people.push(new Person("Grandma", 83));
people.push(new Person("TimeTraveler", -5000));
people.push(new Person("Infant", 1));
while (!people.empty()) {
cout << people.top()->name;
delete people.top();
people.pop();
}
}
It's supposed to give priority based on age (older people get higher priority, and thus leave the queue first), but it doesn't work. Instead, I'm getting this output:
Infant
Grandma
TimeTraveler
YoungMan
And I have no idea what this is ordered by, but it's definitely not age.
priority_queue<Person*> actually orders its elements by comparing the memory addresses of the Person objects, using the comparator std::less<Person*>.
Declare a priority_queue<Person> instead to order based on the operator< you provided.
Or if you insist on using pointers (for some reason) then declare as:
auto age_comp = [](const std::unique_ptr<Person>& lhs, const std::unique_ptr<Person>& rhs) -> bool {
return *lhs < *rhs;
};
std::priority_queue<std::unique_ptr<Person>, std::vector<std::unique_ptr<Person>>,
decltype(age_comp)> people(age_comp);
// note: must pass age_comp to the std::priority_queue constructor here, as
// lambda closure types are not default-constructible before C++20
Note that this uses smart pointers rather than raw pointers. The former are much more commonly used in modern C++; don't use raw pointers unless you have a very good reason to.
Also, operator< of Person should be const-qualified, since it shouldn't change the Person object it belongs to. The comparator of std::priority_queue calls it on const objects, so the code will fail to compile if operator< is not const. So, alter operator< to:
bool operator<(const Person& p) const {
return age < p.age;
}
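Putting both variants together, a minimal sketch might look like the following. The aggregate `Person`, the `OlderFirst` functor (used here instead of the lambda so that no constructor argument is needed), and the two helper functions are assumptions made for illustration.

```cpp
#include <memory>
#include <queue>
#include <string>
#include <vector>

struct Person {
    std::string name;
    int age;
    // const-qualified so it can be called on const objects
    bool operator<(const Person& p) const { return age < p.age; }
};

// Compares the pointed-to objects rather than the pointer values.
struct OlderFirst {
    bool operator()(const std::unique_ptr<Person>& lhs,
                    const std::unique_ptr<Person>& rhs) const {
        return *lhs < *rhs;
    }
};

// Variant 1: store Person by value; std::less<Person> uses operator<.
std::vector<std::string> pop_order_by_value() {
    std::priority_queue<Person> people;
    people.push(Person{"YoungMan", 21});
    people.push(Person{"Grandma", 83});
    people.push(Person{"TimeTraveler", -5000});
    people.push(Person{"Infant", 1});
    std::vector<std::string> order;
    while (!people.empty()) {
        order.push_back(people.top().name);
        people.pop();
    }
    return order;
}

// Variant 2: store smart pointers and supply the comparator type.
std::vector<std::string> pop_order_by_pointer() {
    std::priority_queue<std::unique_ptr<Person>,
                        std::vector<std::unique_ptr<Person>>,
                        OlderFirst> people;
    people.push(std::unique_ptr<Person>(new Person{"YoungMan", 21}));
    people.push(std::unique_ptr<Person>(new Person{"Grandma", 83}));
    people.push(std::unique_ptr<Person>(new Person{"TimeTraveler", -5000}));
    people.push(std::unique_ptr<Person>(new Person{"Infant", 1}));
    std::vector<std::string> order;
    while (!people.empty()) {
        order.push_back(people.top()->name);
        people.pop();  // the unique_ptr frees the Person
    }
    return order;
}
```

Both functions return the names from oldest to youngest: Grandma, YoungMan, Infant, TimeTraveler.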

std::find not using my defined == operator

I have a simple class that I am storing in a vector as pointers. I want to use find on the vector, but it is failing to find my object. Upon debugging, it doesn't seem to call the == operator I've provided. I can 'see' the object in the debugger, so I know it's there. The code below even uses a copy of the first item in the list, but still fails. The only way I can make it pass is to use MergeLine* mlt = LineList.begin(), which shows me that it is comparing the pointers and not using my equality operator at all.
class MergeLine {
public:
std::string linename;
int StartIndex;
double StartValue;
double FidStart;
int Length;
bool operator < (const MergeLine &ml) const {return FidStart < ml.FidStart;}
bool operator == (const MergeLine &ml) const {
return linename.compare( ml.linename) == 0;}
};
class OtherClass{
public:
std::vector<MergeLine*>LineList;
std::vector<MergeLine*>::iterator LL_iter;
void DoSomething( std::string linename){
// this is the original version that returned LineList.end()
// MergeLine * mlt
// mlt->linename = linename;
// this version doesn't work either (I thought it would for sure!)
MergeLine *mlt =new MergeLine(*LineList.front());
LL_iter = std::find(LineList.begin(), LineList.end(), mlt);
if (LL_iter == LineList.end()) {
throw(Exception("line not found in LineList : " + mlt->linename));
}
MergeLine * ml = *LL_iter;
}
};
cheers,
Marc
Since your container contains pointers and not objects, the comparison will be between the pointers. The only way the pointers will be equal is when they point to the exact same object. As you've noticed the comparison operator for the objects themselves will never be called.
You can use std::find_if and pass it a comparison object to use.
class MergeLineCompare
{
MergeLine * m_p;
public:
MergeLineCompare(MergeLine * p) : m_p(p)
{
}
bool operator()(MergeLine * p)
{
return *p == *m_p;
}
};
LL_iter = std::find_if(LineList.begin(), LineList.end(), MergeLineCompare(mlt));
I think what you really want is to use std::find_if like this:
struct MergeLineNameCompare
{
std::string searchname;
MergeLineNameCompare(const std::string &name) : searchname(name)
{
}
bool operator()(const MergeLine * line) const
{
return searchname.compare( line->linename ) == 0;
}
};
LL_iter = std::find_if(LineList.begin(), LineList.end(), MergeLineNameCompare(linename) );
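The operator == (no matter which form) is better saved for real comparison of equality.
Operator overloading can't help here: an overloaded operator needs at least one operand of class or enumeration type, so you cannot overload == for two raw pointers.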
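With a C++11 compiler, the same lookup can be sketched with a lambda instead of a hand-written functor. The trimmed-down `MergeLine` and the helper `find_by_name` below are assumptions for illustration, not the asker's full class.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Reduced version of the question's class, for illustration only.
struct MergeLine {
    std::string linename;
    double FidStart;
};

// Returns the first stored pointer whose target has the given name,
// or nullptr if there is none.
MergeLine* find_by_name(const std::vector<MergeLine*>& lines,
                        const std::string& name) {
    auto it = std::find_if(lines.begin(), lines.end(),
                           [&name](const MergeLine* line) {
                               return line->linename == name;
                           });
    return it == lines.end() ? nullptr : *it;
}
```

Unlike std::find with a pointer argument, this compares the names of the pointed-to objects, so it also finds an entry when the caller only has a copy or just the name.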
Bjarne Stroustrup :-
References were introduced primarily to support operator overloading.
C passes every function argument by value, and where passing an object
by value would be inefficient or inappropriate the user can pass a
pointer. This strategy doesn’t work where operator overloading is
used. In that case, notational convenience is essential so that a user
cannot be expected to insert address-of operators if the objects are
large.
So, maybe not the best option, but you could store the objects by value instead:
std::vector<MergeLine>LineList;
std::vector<MergeLine>::iterator LL_iter;

Object comparison with Sorting. C++

I'm having a hard time trying to understand other people's code here.
I would really appreciate it if someone could help me.
Let's say there is an array of objects, vpair_list, whose elements are of class vpair. So, it would be like this:
class vpair
{
public:
int vid;
int vlabel;
};
bool operator < (const vpair& x, const vpair& y);
vpair* vpair_list;
vpair_list = new vpair[25];
..
sort(vpair_list, vpair_list+j);
What I know from that is sort() compares each element of array vpair_list and sorts them.
The thing is that I just can't understand how that sorting works since the object vpair has two different properties.
Does the sorting work like comparing each property(vid and vlabel) or....? What I thought was the sorting was supposed to be done by comparing specific field or property (either vid or vlabel here).
But this code hasn't got anything to do with that and seems like it just compares the whole object. Could someone tell me how that works?
Thank you in advance.
The standard approach (std::tie is declared in <tuple>):
class vpair
{
public:
int vid;
int vlabel;
};
bool operator < (vpair const& x, vpair const& y)
{
return std::tie(x.vid, x.vlabel) < std::tie(y.vid, y.vlabel);
}
Of course, the operator can be a member:
class vpair
{
int vid;
int vlabel;
public:
bool operator < (vpair const& y) const
{
return std::tie(vid, vlabel) < std::tie(y.vid, y.vlabel);
}
};
Sort, by default, compares with the operator<. You can implement this operator for your class like so:
public:
bool operator < (const vpair& other) const
{
return (vid < other.vid); // Uses vid but this can be vlabel or something else.
}
If you don't have an overload for the operator< with the class you're using, you can always pass in a comparison function as std::sort's third argument:
bool compare_func(const vpair& i, const vpair& j) { return (i.vid < j.vid); }
sort(vpair_list, vpair_list+j, compare_func);
Does the sorting work like comparing each property(vid and vlabel) or....?
It happens exactly how you want it to happen.
By default, as people have mentioned, the < operator is used by the various sort algorithms to arrange elements in ascending order. However, for classes/structs there is no default way to compare them, meaning that you, the programmer, have to code it.
That is what
bool operator < (const vpair& x, const vpair& y);
is. It is just a declaration of the function that the programmer has defined elsewhere to compare two vpair objects. The programmer applies their own rules and ultimately returns true or false, and sort uses that result to order the elements.
So you can decide exactly how you want it to sort.
bool operator < (const vpair& x, const vpair& y)
{
if(x.vid != y.vid)
return x.vid<y.vid;
return x.vlabel <y.vlabel;
}
This would sort in ascending order of vid; if two vids are equal, it then sorts in ascending order of vlabel.
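To see that ordering in action, here is a small sketch using the std::tie form of the same comparison. The helper `sorted_copy` and the sample values are made up for the example.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

struct vpair {
    int vid;
    int vlabel;
};

// Lexicographic comparison: vid first, vlabel as the tiebreaker.
bool operator<(const vpair& x, const vpair& y) {
    return std::tie(x.vid, x.vlabel) < std::tie(y.vid, y.vlabel);
}

// Returns a sorted copy; std::sort picks up operator< by default.
std::vector<vpair> sorted_copy(std::vector<vpair> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Sorting the pairs (2,1), (1,9), (2,0), (1,3) this way gives (1,3), (1,9), (2,0), (2,1).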

Avoiding Helper Functions for Doing Comparisons

Say I have a type with a member function:
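class Thing {
std::string m_name;
public:
std::string & getName() {
return m_name;
}
// const overload so that getName() can be called through a Thing const&
std::string const& getName() const {
return m_name;
}
};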
And say I have a collection of that type:
std::vector<Thing> things;
And I want to keep the things in order by name. To do that, I use std::lower_bound to figure out where to put it:
bool thingLessThan(Thing const& thing, std::string const& name) {
return thing.getName() < name;
}
void addThing(std::string const& name) {
vector<Thing>::iterator position = lower_bound(
things.begin(), things.end(),
name,
thingLessThan);
if (position == things.end() || position->getName() != name) {
position = things.insert(position, Thing());
position->getName() = name;
}
}
Is there a way to do the same thing as the thingLessThan function without actually creating a function, perhaps using std::mem_fun, std::less, etc?
Other than a lambda, you can simply define an operator< that adheres to strict weak ordering, so that a container of your objects can be compared by STL algorithms with the default predicate std::less:
class whatever
{
public:
bool operator<(const whatever& rhs) const { return x < rhs.x; }
private:
int x;
};
std::vector<whatever> v;
std::sort(v.begin(), v.end());
Sure. You can use a lambda expression (assuming your compiler supports it):
vector<Thing>::iterator position = lower_bound(
things.begin(), things.end(),
name,
[](Thing const& thing, std::string const& name) { return thing.getName() < name; });
Of course, an alternative option is just to define operator< for the class, then it will be used by default, if you don't specify another comparer function for std::lower_bound.
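For completeness, here is a self-contained sketch of the sorted insert using the lambda. It assumes a `Thing` with a const getter and a constructor taking the name, and an `addThing` that receives the vector as a parameter and reports whether it inserted; these differ from the question's code and are only for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

class Thing {
    std::string m_name;
public:
    explicit Thing(std::string name) : m_name(std::move(name)) {}
    // const getter so it can be called through a const reference
    const std::string& getName() const { return m_name; }
};

// Inserts a Thing called `name` at its sorted position unless one with
// that name already exists. Returns true if an element was inserted.
bool addThing(std::vector<Thing>& things, const std::string& name) {
    auto position = std::lower_bound(
        things.begin(), things.end(), name,
        // lower_bound calls comp(element, value), in that order
        [](const Thing& thing, const std::string& n) {
            return thing.getName() < n;
        });
    if (position != things.end() && position->getName() == name) {
        return false;
    }
    things.insert(position, Thing(name));
    return true;
}
```

Adding "pear", "apple", and "mango" in that order leaves the vector sorted as apple, mango, pear, and a second attempt to add "apple" returns false.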
It depends on what your purpose is. If you just like the syntactic niceness of not declaring something that is used in only one place, use a lambda expression to create an anonymous function.
You can overload operator<() and use std::less<T> if you don't want to write predicates constantly. You can also use lambda expressions, which would be much nicer, because operator<() is logically connected only with things that can be put in some order in obvious ways, like numbers or strings.
If you use a std::map, the strings will be placed in alphabetical order automatically. If you want to modify the ordering further, create your own key comparison function. I think this would be the simplest option.
To use a std::list, you can write your own comparison code inside of the addThing() function that goes through the list looking at each string and inserts the new one at the appropriate place.