Does the STL set equality operator check size first?

When I'm using the == or != operator to compare two sets, does that operator actually compare the size of the two sets first? I'm wondering if I need to manually compare the two sizes first to make it more efficient, or if I would actually be making it less efficient. I know the equality and inequality operators will check size, I just don't know if it will do so first.
bool checkEqualTo( const set<int> & set1, const set<int> & set2 )
{
    // Should I include comparison of sizes first?
    if ( set1.size() != set2.size() )
    {
        return false;
    }
    if ( set1 != set2 )
    {
        return false;
    }
    return true;
}

Yes, the size comparison comes first, so a separate size check beforehand is redundant. From the C++11 standard, §23.2.1, Table 96 (Container requirements):
Expression:
a == b (where a and b denote values of type X and X denotes a container class containing objects of type T)
Operational semantics:
distance(a.begin(), a.end()) == distance(b.begin(), b.end()) &&
equal(a.begin(), a.end(), b.begin())
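As a rough illustration of those operational semantics, the sketch below spells out the same two steps by hand. The helper name sets_equal is hypothetical (it is not a standard function), and it uses size() where the table uses distance(); it is only meant to show that the element-wise comparison is skipped when the sizes differ.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Hypothetical helper mirroring the quoted semantics: compare sizes first,
// then compare elements pairwise only if the sizes match (&& short-circuits).
bool sets_equal(const std::set<int>& a, const std::set<int>& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin());
}

// Small self-check: the helper agrees with the built-in operator==.
void sets_equal_example() {
    const std::set<int> a{1, 2, 3};
    const std::set<int> b{1, 2, 3};
    const std::set<int> c{1, 2};
    assert(sets_equal(a, b) == (a == b));
    assert(sets_equal(a, c) == (a == c));
}
```

In other words, plain set1 == set2 already does what checkEqualTo does.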

Related

Comparator for matching point in a range

I need to create a std::set of ranges for finding matching points in these ranges. Each range is defined as follows:
struct Range {
    uint32_t start;
    uint32_t end;
    uint32_t pr;
};
In this structure, the start/end pair identifies each range, and pr identifies the priority of that range. That means if a single point falls into 2 different ranges, I'd like to return the range with the smaller pr. I'd like to create a std::set with a transparent comparator to match points like this:
struct RangeComparator {
    bool operator()(const Range& l, const Range& r) const {
        if (l.end < r.start)
            return true;
        if (l.end < r.end && l.pr >= r.pr)
            return true;
        return false;
    }
    bool operator()(const Range& l, uint32_t p) const {
        if (p < l.start)
            return true;
        return false;
    }
    bool operator()(uint32_t p, const Range& r) const {
        if (p < r.start)
            return true;
        return false;
    }
    using is_transparent = int;
};
std::set<Range, RangeComparator> ranges;
ranges.emplace(100,250,1);
ranges.emplace(200,350,2);
auto v1 = ranges.find(110); // <-- return range 1
auto v2 = ranges.find(210); // <-- return range 1 because pr range 1 is less
auto v3 = ranges.find(260); // <-- return range 2
I know my comparators are wrong. I wonder how I can write these 3 comparators to answer these queries correctly? Is it possible at all?

find returns an element that compares equivalent to the argument. Equivalent means that it compares neither larger nor smaller in the strict weak ordering provided to the std::set.
Therefore, to make your use case work, you want all points in a range to compare equivalent to the range.
If two ranges overlap, then the points shared by the two ranges need to compare equivalent to both ranges. The priority doesn't matter for this, since the equivalence should presumably hold if only one of the ranges is present.
However, one of the defining properties of a strict weak ordering is that the property of comparing equivalent is transitive. Therefore, in this ordering the two overlapping ranges must also compare equivalent to each other in order to satisfy the requirements of std::set.
Therefore, as long as the possible ranges are not completely separated, the only valid strict weak ordering is the one that compares all ranges and points equivalent.
This is however not an order that would give you what you want.
This analysis holds for all standard library associative containers, since they have the same requirements on the ordering.
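For contrast, here is a minimal sketch of the case that does work: ranges that are completely separated. It assumes end is an inclusive bound and that no two stored ranges overlap. The names DisjointRangeLess, RangeSet and find_containing are made up for illustration. It does not solve the overlapping, priority-based lookup from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

struct Range {
    std::uint32_t start;
    std::uint32_t end;  // treated as inclusive (assumption)
    std::uint32_t pr;
};

// Only valid while the stored ranges never overlap.
struct DisjointRangeLess {
    using is_transparent = void;  // enables find() with a point (C++14)
    bool operator()(const Range& l, const Range& r) const { return l.end < r.start; }
    bool operator()(const Range& l, std::uint32_t p) const { return l.end < p; }
    bool operator()(std::uint32_t p, const Range& r) const { return p < r.start; }
};

using RangeSet = std::set<Range, DisjointRangeLess>;

// Returns the stored range containing p, or nullptr if there is none.
const Range* find_containing(const RangeSet& ranges, std::uint32_t p) {
    auto it = ranges.find(p);
    return it == ranges.end() ? nullptr : &*it;
}

void disjoint_ranges_example() {
    RangeSet ranges;
    ranges.insert(Range{100, 250, 1});
    ranges.insert(Range{300, 350, 2});
    assert(find_containing(ranges, 110)->pr == 1);
    assert(find_containing(ranges, 320)->pr == 2);
    assert(find_containing(ranges, 260) == nullptr);  // falls in the gap
}
```

Because a point compares equivalent to exactly one stored range here, the lookup is well defined. Once ranges overlap, that stops being true.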

How does std::set comparator function work?

Currently working on an algorithm problem using set.
set<string> mySet;
mySet.insert("(())()");
mySet.insert("()()()");
//print mySet:
(())()
()()()
Ok great, as expected.
However if I put a comp function that sorts the set by its length, I only get 1 result back.
struct size_comp
{
    bool operator()(const string& a, const string& b) const{
        return a.size()>b.size();
    }
};
set<string, size_comp> mySet;
mySet.insert("(())()");
mySet.insert("()()()");
//print myset
(())()
Can someone explain to me why?
I tried using a multi set, but its appending duplicates.
multiset<string,size_comp> mSet;
mSet.insert("(())()");
mSet.insert("()()()");
mSet.insert("()()()");
//print mset
"(())()","()()()","()()()"

std::set stores unique values only. Two values a,b are considered equivalent if and only if
!comp(a,b) && !comp(b,a)
or in everyday language, if a is not smaller than b and b is not smaller than a. In particular, only this criterion is used to check for equality, the normal operator== is not considered at all.
So with your comparator, the set can only contain one string of length n for every n.
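To make the equivalence rule concrete, the following sketch reproduces the situation from the question. The names SizeOnly and equivalent are illustrative helpers, not part of the standard library.

```cpp
#include <cassert>
#include <set>
#include <string>

// Compares by length only, as in the question (longer strings first).
struct SizeOnly {
    bool operator()(const std::string& a, const std::string& b) const {
        return a.size() > b.size();
    }
};

// The test std::set applies: neither value orders before the other.
template <typename Compare, typename T>
bool equivalent(const Compare& comp, const T& a, const T& b) {
    return !comp(a, b) && !comp(b, a);
}

void size_only_example() {
    std::set<std::string, SizeOnly> s;
    s.insert("(())()");
    const bool inserted = s.insert("()()()").second;  // same length as above
    assert(!inserted);                // rejected as an equivalent element
    assert(s.size() == 1);
    assert(*s.begin() == "(())()");   // the element inserted first is kept
    assert(equivalent(SizeOnly{}, std::string("asdf"), std::string("aaaa")));
}
```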
If you want to allow multiple values that are equivalent under your comparison, use std::multiset. This will of course also allow exact duplicates: under your comparator, "asdf" is just as equivalent to "aaaa" as it is to "asdf".
If that does not make sense for your problem, you need to come up with either a different comparator that induces a proper notion of equality or use another data structure.
A quick fix to get the behavior you probably want (correct me if I'm wrong) would be introducing a secondary comparison criterion like the normal operator>. That way, we sort by length first, but are still able to distinguish between different strings of the same length.
struct size_comp
{
    bool operator()(const string& a, const string& b) const{
        if (a.size() != b.size())
            return a.size() > b.size();
        return a > b;
    }
};
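A small usage sketch of that two-level comparator, under the assumption that the desired result is "longest first, ties broken by operator>". SizeThenContent and ordered are names chosen for this example only.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Longer strings first; strings of equal length are ordered by operator>.
struct SizeThenContent {
    bool operator()(const std::string& a, const std::string& b) const {
        if (a.size() != b.size())
            return a.size() > b.size();
        return a > b;
    }
};

// Runs the input through a set and returns the elements in set order.
std::vector<std::string> ordered(const std::vector<std::string>& input) {
    std::set<std::string, SizeThenContent> s(input.begin(), input.end());
    return std::vector<std::string>(s.begin(), s.end());
}

void size_then_content_example() {
    const std::vector<std::string> out =
        ordered({"()", "(())()", "()()()", "()()()"});
    // Both six-character strings survive; only the exact duplicate is dropped.
    const std::vector<std::string> expected{"()()()", "(())()", "()"};
    assert(out == expected);
}
```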

The comparator template argument, which defaults to std::less<T>, must represent a strict weak ordering relation between values in its domain.
This kind of relation has some requirements:
it's irreflexive (x < x yields false)
it's asymmetric (x < y implies that y < x is false)
it's transitive (x < y && y < z implies x < z)
incomparability is transitive (if neither of x, y is less than the other, and neither of y, z is, then neither of x, z is)
Taking this further, we can define equivalence between values in terms of this relation: if !(x < y) && !(y < x), then x and y are considered equivalent.
In your situation, for all x, y such that x.size() == y.size(), both comp(x,y) and comp(y,x) are false; since neither x nor y is less than the other, they are considered equivalent.
This equivalence is used to determine whether two items are the same element, which is why the second insertion in your example is ignored.
To fix this you must make sure that your comparator never returns false for both comp(x,y) and comp(y,x) if you don't want to consider x equal to y, for example by doing
auto cmp = [](const string& a, const string& b) {
    if (a.size() != b.size())
        return a.size() > b.size();
    else
        return std::less<string>()(a, b);
};
So that for inputs of the same length you fall back to normal lexicographic order.

This is because equality of elements is defined by the comparator. An element is considered equivalent to another if and only if !comp(a, b) && !comp(b, a).
Since the length of "(())()" is neither greater nor less than the length of "()()()", they are considered equivalent by your comparator. A std::set holds only unique elements, so inserting an equivalent object does nothing: the existing element is kept, not overwritten.
The default comparator uses operator<, which in the case of strings, performs lexicographical ordering.
I tried using a multi set, but its appending duplicates.
Multiset indeed does allow duplicates. Therefore both strings will be contained despite having the same length.

size_comp considers only the length of the strings. The default comparison operator uses lexicographic comparison, which distinguishes based on the content of the string as well as the length.

multimap with custom keys - comparison function

bool operator<(const Binding& b1, const Binding& b2)
{
    if(b1.r != b2.r && b1.t1 != b2.t1)
    {
        if(b1.r != b2.r)
            return b1.r < b2.r;
        return b1.t1 < b2.t1;
    }
    return false;
}
I have a comparison function like above. Basically, I need to deem the objects equal if one of their attribute matches. I am using this comparison function for my multimap whose key is 'Binding' object.
The problem I face is that the lower_bound and upper_bound functions return the same iterator, which points to a valid object. For example, (t1 = 1, r = 2) is already in the map, and when I search for it with (t1 = 1, r = 2), I get the same iterator from both upper_bound and lower_bound.
Is anything wrong with the comparison function? Is there a way to figure a function where I can still ensure that the objects are equivalent even if just one of their field matches?
Shouldn't the upper_bound iterator return the object past the

The comparator for a map or multimap is expected to express a strict weak ordering relation between the set of keys. Your requirement "two objects are equivalent if just one of their fields matches" cannot be such a relation. Take these three keys:
1: r=1, t1=10
2: r=1, t1=42
3: r=2, t1=42
Clearly, keys 1 and 2 are equivalent, because they have the same r. Likewise, 2 and 3 are equivalent because of the same t1. Since equivalence must be transitive, 1 and 3 have to be equivalent as well, although they have no matching fields.
As a corollary, all possible keys have to be equivalent under these circumstances, which means you don't have any ordering at all and a multimap is not the right way to go.
For your case, Boost.MultiIndex comes to mind. You could then have two separate indices for r and t1 and do your lower_bound, upper_bound and equal_range searches over both indices separately.
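The three-key argument can be checked mechanically. In this sketch, question_less is a hypothetical free function with the same behavior as the operator< from the question, and Binding is reduced to the two int fields discussed here (an assumption, since the real type is not shown).

```cpp
#include <cassert>

struct Binding {
    int r;
    int t1;
};

// Same behavior as the question's operator<: "less" only when both fields
// differ, and then decided by r.
bool question_less(const Binding& a, const Binding& b) {
    if (a.r != b.r && a.t1 != b.t1)
        return a.r < b.r;
    return false;
}

// Equivalence as the associative containers see it.
bool equivalent(const Binding& a, const Binding& b) {
    return !question_less(a, b) && !question_less(b, a);
}

void binding_equivalence_example() {
    const Binding k1{1, 10};
    const Binding k2{1, 42};
    const Binding k3{2, 42};
    assert(equivalent(k1, k2));   // same r
    assert(equivalent(k2, k3));   // same t1
    assert(!equivalent(k1, k3));  // no shared field: transitivity is broken
}
```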

Your comparison function, after removing redundant code, can be rewritten as
bool operator<(const Binding& b1, const Binding& b2)
{
    if(b1.r != b2.r && b1.t1 != b2.t1)
    {
        //if(b1.r != b2.r) // always true
        return b1.r < b2.r;
        //return b1.t1 < b2.t1; // Never reached
    }
    return false;
}
Or, by De Morgan's law,
bool operator<(const Binding& b1, const Binding& b2)
{
    if(b1.r == b2.r || b1.t1 == b2.t1) return false;
    else return b1.r < b2.r;
}
This does not guarantee a < c if a < b and b < c.
Ex: Binding(r, t): a(3, 5), b(4, 6), c(5, 5). Here a < b and b < c hold, but a < c is false because a and c share the same t.
If your comparison function doesn't satisfy the above criterion, you may get strange results (including infinite loops in some cases if the library is not robust).
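A brute-force check over a handful of sample keys is a cheap way to catch this kind of bug. The sketch below is illustrative only: is_transitive and question_less are made-up names, and passing the check on a sample does not prove the comparator is valid in general.

```cpp
#include <cassert>
#include <vector>

struct Binding {
    int r;
    int t1;
};

// Same behavior as the simplified comparison above.
bool question_less(const Binding& a, const Binding& b) {
    if (a.r == b.r || a.t1 == b.t1) return false;
    return a.r < b.r;
}

// Returns false if some x < y and y < z exist in the sample without x < z.
template <typename T, typename Less>
bool is_transitive(const std::vector<T>& sample, Less less) {
    for (const T& x : sample)
        for (const T& y : sample)
            for (const T& z : sample)
                if (less(x, y) && less(y, z) && !less(x, z))
                    return false;
    return true;
}

void transitivity_example() {
    // Binding(r, t1): a(3, 5), b(4, 6), c(5, 5)
    const std::vector<Binding> sample{{3, 5}, {4, 6}, {5, 5}};
    assert(!is_transitive(sample, question_less));

    // A plain lexicographic comparison passes on the same sample.
    auto lexicographic = [](const Binding& a, const Binding& b) {
        return a.r != b.r ? a.r < b.r : a.t1 < b.t1;
    };
    assert(is_transitive(sample, lexicographic));
}
```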

Your comparison function will return false if either the r fields or the t1 fields match, because of the && in the if() clause. Did you mean || ? Replacing the && with || would give you a valid comparison function which compares first by the r field, then by the t1 field.
Note that std::pair already has a comparison function that does exactly that.
Your text below your code though states:
Basically, I need to deem the objects equal if one of their attribute matches
You cannot do that, as the resulting equivalence wouldn't be transitive (thus you wouldn't have a strict weak ordering).
The inside of your if block has another if that is certain to be true, as the && clause means both sides are true.
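One common way to write the "r first, then t1" ordering is std::tie, which builds tuples of references and reuses their lexicographic operator<. This is a sketch under the assumption that Binding has just these two int fields. The std::string mapped type and count_matches are placeholders for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <tuple>

struct Binding {
    int r;
    int t1;
};

// Lexicographic order: compare r first, then t1.
bool operator<(const Binding& a, const Binding& b) {
    return std::tie(a.r, a.t1) < std::tie(b.r, b.t1);
}

// Number of entries whose key is equivalent to `key`.
std::size_t count_matches(const std::multimap<Binding, std::string>& m,
                          const Binding& key) {
    const auto range = m.equal_range(key);
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}

void lexicographic_binding_example() {
    std::multimap<Binding, std::string> m;
    m.emplace(Binding{2, 1}, "first");
    m.emplace(Binding{2, 1}, "second");
    m.emplace(Binding{2, 7}, "other");
    const Binding key{2, 1};  // r = 2, t1 = 1
    // With a valid ordering the two bounds delimit the matching entries.
    assert(m.lower_bound(key) != m.upper_bound(key));
    assert(count_matches(m, key) == 2);
}
```

With this ordering only keys that match in both fields are equivalent, which is the price of having a valid comparator.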

How to sort only subset in std::vector?

I have a vector of pairs:
std::vector<std::pair<std::string, Cell::Ptr>> mCells;
I want to sort only a subset of elements (on the first's string). The Cell has method GetSorted() which indicates if it's part of this subset or not.
This is what I had initially:
std::sort(mCells.begin(), mCells.end(),
    [](std::pair<std::string, Cell::Ptr> const &a,
       std::pair<std::string, Cell::Ptr> const &b)
    {
        // Only compare when both cells need to be sorted; otherwise return false
        // to indicate that they are already in correct order. This keeps the
        // non-marked cells at their original positions.
        if (a.second->GetSorted() && b.second->GetSorted())
        {
            return a.first < b.first;
        }
        else
        {
            return false;
        }
    });
But it does not work, because sort, of course, does not compare all combinations. Sometimes the return a.first < b.first line is not even executed once.
To define the required sort function, here's an example. Suppose the elements are:
G* F C* A* D B E*
Only the *-ones need to be sorted. But, the sort should only be applied to adjacent to-be-sorted elements. (That's why I had a.second->GetSorted() && b.second->GetSorted().) The result should then be:
G* F A* C* D B E*
So, only A and C are adjacent, and are sorted. Is there an easy solution to this problem?
Alternatively a solution that results in:
A* F C* E* D B G*
would also be usable for me at the moment. So, sorting all * elements, while leaving the others where they are. This appears to be easier to do.

You need to separate finding the ranges to be sorted and sorting them:
using namespace std;
auto isSorted = [](std::pair<std::string, Cell::Ptr> const &a) {
    return a.second->GetSorted();
};
auto it = begin(mCells);
const auto itEnd = end(mCells);
while (it != itEnd) {
    auto rangeStart = find_if(it, itEnd, isSorted);
    if (rangeStart == itEnd)
        break;
    auto rangeEnd = find_if_not(rangeStart, itEnd, isSorted);
    if (distance(rangeStart, rangeEnd) > 1) {
        // pair comparison should do the trick here
        sort(rangeStart, rangeEnd);
    }
    it = rangeEnd;
}
Just saw your edit: you can achieve the alternate solution by defining a custom iterator class that skips non-sorted elements (std::sort needs random-access iterators, so an input iterator is not enough), then using a single sort() call on the whole "range".
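Here is a self-contained sketch of both variants from the question. Since Cell is not shown, a std::pair<std::string, bool> stands in for the real element type, with the bool playing the role of GetSorted() (an assumption). For the second variant it does not use a custom iterator: it copies the marked elements out, sorts them, and writes them back into the same slots, which is a simpler technique swapped in for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for std::pair<std::string, Cell::Ptr>; second == GetSorted().
using Entry = std::pair<std::string, bool>;

// Variant 1: sort each run of adjacent marked entries separately.
void sort_marked_runs(std::vector<Entry>& cells) {
    auto marked = [](const Entry& e) { return e.second; };
    auto it = cells.begin();
    while (it != cells.end()) {
        auto run_begin = std::find_if(it, cells.end(), marked);
        auto run_end = std::find_if_not(run_begin, cells.end(), marked);
        std::sort(run_begin, run_end);  // no-op for empty or single-element runs
        it = run_end;
    }
}

// Variant 2: sort all marked entries among themselves; unmarked entries
// keep their original positions.
void sort_marked_in_place(std::vector<Entry>& cells) {
    std::vector<std::size_t> slots;  // indices of marked entries
    std::vector<Entry> marked;       // copies of the marked entries
    for (std::size_t i = 0; i < cells.size(); ++i) {
        if (cells[i].second) {
            slots.push_back(i);
            marked.push_back(cells[i]);
        }
    }
    std::sort(marked.begin(), marked.end());
    for (std::size_t k = 0; k < slots.size(); ++k)
        cells[slots[k]] = marked[k];
}

// Example helper: concatenates the names in order.
std::string names(const std::vector<Entry>& cells) {
    std::string out;
    for (const Entry& e : cells) out += e.first;
    return out;
}

void sort_subset_example() {
    // G* F C* A* D B E*  (true marks the starred entries)
    const std::vector<Entry> input{{"G", true}, {"F", false}, {"C", true},
                                   {"A", true}, {"D", false}, {"B", false},
                                   {"E", true}};
    std::vector<Entry> runs = input;
    sort_marked_runs(runs);
    assert(names(runs) == "GFACDBE");

    std::vector<Entry> all = input;
    sort_marked_in_place(all);
    assert(names(all) == "AFCEDBG");
}
```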

find the difference between two sets of pointers to the same object

How can I find the difference between two sets of pointers to the same object?
Is there an efficient way without iterating through all the objects of both sets?
I have two of these sets:
std::set<Object*>
If an object's private member (name) is the same as another object's name, the two objects are considered the same.

STL's algorithm library is awesome, extensible, and underused.
This will give you the set difference as a vector (I suppose you could convert that to a set, but there's no need, at least for what you asked, and a vector is faster since the sets are already sorted).
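#include <algorithm>
#include <iterator>
#include <set>
#include <vector>
template<typename T>
std::vector<T> set_diff(std::set<T> const &a, std::set<T> const &b) {
    std::vector<T> v;
    // v starts empty, so append through back_inserter instead of writing at v.begin()
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(v));
    return v;
}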
Optionally, put after the constructor (the difference can never hold more than a.size() elements)
v.reserve(a.size());
and before the return (C++11)
v.shrink_to_fit();
Note: This yields the items in a that are not in b. To find all items in one of the two but not the other, use std::set_symmetric_difference instead.
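One caveat: a plain std::set<Object*> orders and compares the pointer values, not the names. If identity is meant to be "same name", both the sets and the algorithm need a name-based comparator. The following is a sketch under that assumption. Object here is a hypothetical stand-in with a public name member, and ByName, ObjectSet and name_difference are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Object (name made public for brevity).
struct Object {
    std::string name;
};

// Orders pointers by the pointed-to name instead of by address.
struct ByName {
    bool operator()(const Object* a, const Object* b) const {
        return a->name < b->name;
    }
};

using ObjectSet = std::set<Object*, ByName>;

// Elements of a whose names do not occur in b.
std::vector<Object*> name_difference(const ObjectSet& a, const ObjectSet& b) {
    std::vector<Object*> out;
    // The algorithm must use the same ordering the sets are sorted by.
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out), ByName{});
    return out;
}

void name_difference_example() {
    Object x1{"x"}, y1{"y"}, x2{"x"}, z2{"z"};
    const ObjectSet a{&x1, &y1};
    const ObjectSet b{&x2, &z2};
    const std::vector<Object*> diff = name_difference(a, b);
    assert(diff.size() == 1);
    assert(diff[0] == &y1);  // "x" exists in both sets, under different addresses
}
```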

I think by "different" you mean finding the pointer elements that appear in only one of the sets. The most efficient way is to iterate over the two sets in lockstep; this costs only O(n+m) time, where n and m denote the sizes of the two sets, which in the general case is the lower bound for the problem.
Luckily, the STL set container is based on a balanced binary search tree, so we can iterate over all its elements in order in linear time, and O(n+m) can be achieved.
template<typename T>
std::vector<T> set_diff(std::set<T> const &a, std::set<T> const &b) {
    std::vector<T> v;
    auto ita = a.begin();
    auto itb = b.begin();
    while (ita != a.end() && itb != b.end()) {
        if (*ita == *itb) {
            ++ita, ++itb;
        } else if (*ita < *itb) {
            v.push_back(*ita);
            ++ita;
        } else {
            v.push_back(*itb);
            ++itb;
        }
    }
    for (; ita != a.end(); v.push_back(*ita), ++ita);
    for (; itb != b.end(); v.push_back(*itb), ++itb);
    return v;
}
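The hand-written merge loop computes what the standard library calls the symmetric difference. A shorter sketch with std::set_symmetric_difference, assuming the sets use the default ordering, could look like this (symmetric_diff is an illustrative name):

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Elements that appear in exactly one of the two sets, in sorted order.
template <typename T>
std::vector<T> symmetric_diff(const std::set<T>& a, const std::set<T>& b) {
    std::vector<T> v;
    std::set_symmetric_difference(a.begin(), a.end(), b.begin(), b.end(),
                                  std::back_inserter(v));
    return v;
}

void symmetric_diff_example() {
    const std::set<int> a{1, 2, 3, 5};
    const std::set<int> b{2, 3, 4};
    const std::vector<int> expected{1, 4, 5};
    assert(symmetric_diff(a, b) == expected);
}
```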
