Comparator for matching point in a range

Comparator for matching point in a range - c++

I need to create a std::set of ranges for finding matching points in these ranges. Each range is defined as follows:
struct Range {
uint32_t start;
uint32_t end;
uint32_t pr;
};
In this structure start/end pair identify each range. pr identifies the priority of that range. It means if a single point falls into 2 different ranges, I like to return range with smaller pr. I like to create a std::set with a transparent comparator to match points like this:
struct RangeComparator {
bool operator()(const Range& l, const Range& r) const {
if (l.end < r.start)
return true;
if (l.end < r.end && l.pr >= r.pr)
return true;
return false;
}
bool operator()(const Range& l, uint32_t p) const {
if (p < l.start)
return true;
return false;
}
bool operator()(uint32_t p, const Range& r) const {
if (p < r.start)
return true;
return false;
}
using is_transparent = int;
};
std::set<Range, RangeComparator> ranges;
ranges.emplace(100,250,1);
ranges.emplace(200,350,2);
auto v1 = ranges.find(110); // <-- return range 1
auto v2 = ranges.find(210); // <-- return range 1 because pr range 1 is less
auto v3 = ranges.find(260); // <-- return range 2
I know my comparators are wrong. I wonder how I can write these 3 comparators to answer these queries correctly? Is it possible at all?

find returns an element that compares equivalent to the argument. Equivalent means that it compares neither larger nor smaller in the strict weak ordering provided to the std::set.
Therefore, to make your use case work, you want all points in a range to compare equivalent to the range.
If two ranges overlap, then the points shared by the two ranges need to compare equivalent to both ranges. The priority doesn't matter for this, since the equivalence should presumably hold if only one of the ranges is present.
However, one of the defining properties of a strict weak ordering is that the property of comparing equivalent is transitive. Therefore in this ordering the two ranges must then also compare equal in order to satisfy the requirements of std::set.
Therefore, as long as the possible ranges are not completely separated, the only valid strict weak ordering is the one that compares all ranges and points equivalent.
This is however not an order that would give you what you want.
This analysis holds for all standard library associative containers, since they have the same requirements on the ordering.

Related

How does std::set comparator function work?

Currently working on an algorithm problems using set.
set<string> mySet;
mySet.insert("(())()");
mySet.insert("()()()");
//print mySet:
(())()
()()()
Ok great, as expected.
However if I put a comp function that sorts the set by its length, I only get 1 result back.
struct size_comp
{
bool operator()(const string& a, const string& b) const{
return a.size()>b.size();
}
};
set<string, size_comp> mySet;
mySet.insert("(())()");
mySet.insert("()()()");
//print myset
(())()
Can someone explain to me why?
I tried using a multi set, but its appending duplicates.
multiset<string,size_comp> mSet;
mSet.insert("(())()");
mSet.insert("()()()");
mSet.insert("()()()");
//print mset
"(())()","()()()","()()()"

std::set stores unique values only. Two values a,b are considered equivalent if and only if
!comp(a,b) && !comp(b,a)
or in everyday language, if a is not smaller than b and b is not smaller than a. In particular, only this criterion is used to check for equality, the normal operator== is not considered at all.
So with your comparator, the set can only contain one string of length n for every n.
If you want to allow multiple values that are equivalent under your comparison, use std::multiset. This will of course also allow exact duplicates, again, under your comparator, "asdf" is just as equivalent to "aaaa" as it is to "asdf".
If that does not make sense for your problem, you need to come up with either a different comparator that induces a proper notion of equality or use another data structure.
A quick fix to get the behavior you probably want (correct me if I'm wrong) would be introducing a secondary comparison criterion like the normal operator>. That way, we sort by length first, but are still able to distinguish between different strings of the same length.
struct size_comp
{
bool operator()(const string& a, const string& b) const{
if (a.size() != b.size())
return a.size() > b.size();
return a > b;
}
};

The comparator template argument, which defaults to std::less<T>, must represent a strict weak ordering relation between values in its domain.
This kind of relation has some requirements:
it's not reflexive (x < x yields false)
it's asymmetric (x < y implies that y < x is false)
it's transitive (x < y && y < z implies x < z)
Taking this further we can define equivalence between values in term of this relation, because if !(x < y) && !(y < x) then it must hold that x == y.
In your situation you have that ∀ x, y such that x.size() == y.size(), then both comp(x,y) == false && comp(y,x) == false, so since no x or y is lesser than the other, then they must be equal.
This equivalence is used to determine if two items correspond to the same, thus ignoring second insertion in your example.
To fix this you must make sure that your comparator never returns false for both comp(x,y) and comp(y,x) if you don't want to consider x equal to y, for example by doing
auto cmp = [](const string& a, const string& b) {
if (a.size() != b.size())
return a.size() > b.size();
else
return std::less()(a, b);
}
So that for input of same length you fallback to normal lexicographic order.

This is because equality of elements is defined by the comparator. An element is considered equal to another if and only if !comp(a, b) && !comp(b, a).
Since the length of "(())()" is not greater, nor lesser than the length of "()()()", they are considered equal by your comparator. There can be only unique elements in a std::set, and an equivalent object will overwrite the existing one.
The default comparator uses operator<, which in the case of strings, performs lexicographical ordering.
I tried using a multi set, but its appending duplicates.
Multiset indeed does allow duplicates. Therefore both strings will be contained despite having the same length.

size_comp considers only the length of the strings. The default comparison operator uses lexicographic comparison, which distinguishes based on the content of the string as well as the length.

Binary search on vector objects of an element greater than an attribute

I have a vector which contains lot of elements of my class X .
I need to find the first occurrence of an element in this vector say S such that S.attrribute1 > someVariable. someVariable will not be fixed . How can I do binary_search for this ? (NOT c++11/c++14) . I can write std::binary_search with search function of greater (which ideally means check of equality) but that would be wrong ? Whats the right strategy for fast searching ?

A binary search can only be done if the vector is in sorted order according to the binary search's predicate, by definition.
So, unless all elements in your vector for which "S.attribute1 > someVariable" are located after all elements that are not, this is going to be a non-starter, right out of the gate.
If all elements in your vector are sorted in some other way, that "some other way" is the only binary search that can be implemented.
Assuming that they are, you must be using a comparator, of some sort, that specifies strict weak ordering on the attribute, in order to come up with your sorted vector in the first place:
class comparator {
public:
bool operator()(const your_class &a, const your_class &b) const
{
return a.attribute1 < b.attribute1;
}
};
The trick is that if you want to search using the attribute value alone, you need to use a comparator that can be used with std::binary_search which is defined as follows:
template< class ForwardIt, class T, class Compare >
bool binary_search( ForwardIt first, ForwardIt last,
const T& value, Compare comp );
For std::binary_search to succeed, the range [first, last) must be
at least partially ordered, i.e. it must satisfy all of the following
requirements:
for all elements, if element < value or comp(element, value) is true
then !(value < element) or !comp(value, element) is also true
So, the only requirement is that comp(value, element) and comp(element, value) needs to work. You can pass the attribute value for T, rather than the entire element in the vector to search for, as long as your comparator can deal with it:
class search_comparator {
public:
bool operator()(const your_class &a, const attribute_type &b) const
{
return a.attribute1 < b;
}
bool operator()(const attribute_type &a, const your_class &b) const
{
return a < b.attribute1;
}
};
Now, you should be able to use search_comparator instead of comparator, and do a binary search by the attribute value.
And, all bets are off, as I said, if the vector is not sorted by the given attribute. In that case, you'll need to use std::sort it explicitly, first, or come up with some custom container that keeps track of the vector elements, in the right order, separately and in addition to the main vector that holds them. Using pointers, perhaps, in which case you should be able to execute a binary search on the pointers themselves, using a similar search comparator, that looks at the pointers, instead.

For std::binary_search to succeed, the range need to be sorted.std::binary_search, std::lower_bound works on sorted containers. So every time you add a new element into your vector you need to keep it sorted.
For this purpose you can use std::lower_bound in your insertion:
class X;
class XCompare
{
public:
bool operator()(const X& first, const X& second) const
{
// your sorting logic
}
};
X value(...);
auto where = std::lower_bound(std::begin(vector), std::end(vector), value, XCompare());
vector.insert(where, value);
And again you can use std::lower_bound to search in your vector:
auto where = std::lower_bound(std::begin(vector), std::end(vector), searching_value, XCompare());
Don't forget to check if std::lower_bound was successful:
bool successed = where != std::end(vector) && !(XCompare()(value, *where));
Or directly use std::binary_search if you only want to know that element is in vector.

multimap with custom keys - comparison function

bool operator<(const Binding& b1, const Binding& b2)
{
if(b1.r != b2.r && b1.t1 != b2.t1)
{
if(b1.r != b2.r)
return b1.r < b2.r;
return b1.t1 < b2.t1;
}
return false;
}
I have a comparison function like above. Basically, I need to deem the objects equal if one of their attribute matches. I am using this comparison function for my multimap whose key is 'Binding' object.
The problem I face is that lower_bound and upper_bound functions return the same iterator which points to a valid object. For example (t1 = 1, r = 2) is already in the map and when I try to search it in the map with (t1 = 1, r = 2), I get a same iterator as return value of upper_bound and lower_bound functions.
Is anything wrong with the comparison function? Is there a way to figure a function where I can still ensure that the objects are equivalent even if just one of their field matches?
Shouldn't the upper_bound iterator return the object past the

The comparator for a map or multimap is expected to express a strict weak ordering relation between the set of keys. Your requirement "two objects are equivalent if just one of their fields matches" cannot be such a relation. Take these three keys:
1: r=1, t1=10
2: r=1, t1=42
3: r=2, t1=42
clearly, keys 1 and 2 are equivalent, because they have the same r. Likewise, 2 and 3 are equivalent because of the same t1. That means, that 1 and 3 have to be equivalent as well, although they have no matching fields.
As a corollary, all possible keys have to be equivalent under these circumstances, which means you dont have any ordering at all and a multimap is not the right way to go.
For your case, Boost.MultiIndex comes to mind. You could then have two separate indices for r and t1 and do your lower_bound, upper_bound and equal_range searches over both indices separately.

Your comparision function after removing redundant code can be re-written as
bool operator<(const Binding& b1, const Binding& b2)
{
if(b1.r != b2.r && b1.t1 != b2.t1)
{
//if(b1.r != b2.r) // always true
return b1.r < b2.r;
//return b1.t1 < b2.t1; // Never reached
}
return false;
}
Or by de-morgan's law
bool operator<(const Binding& b1, const Binding& b2)
{
if(b1.r == b2.r || b1.t1 == b2.t1) return false;
else return b1.r < b2.r;
}
This does not guarantee a < c if a < b and b < c
Ex: Binding(r, t): a(3, 5), b(4, 6), c(5, 5)
If your comparision function doesn't follow above crieteria, you may get strange results. (including infinite loops in some cases if library is not robust)

Your comparison function will return false if either the rs or the ts match because of the && in the if() clause. Did you mean || ? Replacing the && with || would give you a valid comparison function which compare first by the r field then the t field.
Note that std::pair already has a comparison function that does exactly that.
Your text below your code though states:
Basically, I need to deem the objects equal if one of their attribute matches
You cannot do that as it wouldn't be transitive (thus you wouldn't have strict ordering).
The inside of your if block has another if that is certain to be true, as the && clause means both sides are true.

search part of an item in a set

is there a way to search a part of an item in a set? I have a set of pairs std::set< std::pair<double, unsigned> > and want to search for an item via a given double. Is there any way I can do this convenient instead of manually creating a pair and searching for it?

Given that your set uses the standard comparison operator to define the ordering within the set, and given that the double element comes first in the pair, the sorting order inside the set is defined primarily by the double elements of the pair (and only for pairs that share the double element will the second element be taken into account for the ordering).
Therefore, the only thing you need to do is to define a comparison operator that compares pairs with single doubles in both directions (note I use C++11 syntax in several places):
using std::pair;
using std::set;
typedef pair<double,unsigned> pair_t;
typedef set<pair_t> set_t;
typedef set_t::iterator it_t;
struct doublecmp
{
/* Compare pair with double. */
bool operator()(const pair_t &p, double d)
{ return (p.first < d); }
/* Compare double with pair. */
bool operator()(double d, const pair_t &p)
{ return (d < p.first); }
};
And with this in place, you can use the std::equal_range algorithm to find the range of all pairs in the set that have a given double d as first element:
std::equal_range(begin(s),end(s),d,doublecmp());
If this is compiled with optimization, the instantiation of doublecmp() is a no-op.
You'll find a fully working code example here.
Why does this work?
Given that your set is declared as set<pair<double,unsigned>>, you are using the default comparison operator less<pair<double,unsigned>>, which is the same as the standard operator< for pair. That is defined as the lexicographic ordering (20.3.3/2 in C++11, or 20.2.2/2 in C++03), therefore the first element of each pair is the primary sorting criterion.
Caveat 1 The solution will generally not work if you declare your set to use a different comparison operator than the default one. It also won't work if the part of the pair you use as searching criterion is the second, rather than the first element.
Caveat 2 The data type used in the search criterion is a floating point type. Equality checks (including the operator<-based indirect equality checking performed by std::equal_range) for floating point numbers are generally difficult. The double you are searching for may have been computed in a way that suggests it should be mathematically identical to certain values in the set, but std::equality_range might not find them (nor would std::find_if suggested in other answers). For equality checks it is generally a good idea to allow for a small ("up to some epsilon") difference between the value you are looking for and the values you consider as matches. You can accomplish this by replacing std::equal_range with explicit calls to std::lower_bound and std::upper_bound and taking into account a parameter epsilon:
pair<it_t,it_t> find_range(set_t &s, double d, double epsilon)
{
return {std::lower_bound(begin(s),end(s),d - epsilon,doublecmp()),
std::upper_bound(begin(s),end(s),d + epsilon,doublecmp())};
}
This leaves the question how to determine the right value for epsilon. This is generally difficult. It is usually computed as an integer multiple of std::numeric_limits<double>::epsilon, but choosing the right factor can be tricky. You'll find more information about this in How dangerous is it to compare floating point values.

Since the set isn't ordered according to your search criteria, you could use std::find_if with a predicate that checks only the pair's first element. This will return an iterator to the first matching element, with the usual caveats about comparing floating point numbers for equality.
double value = 42.;
auto it = std::find_if(the_set.begin(), the_set.end(),
[&value](const std::pair<double, unsigned>& p) { return p.first==value; });

I'm not sure if this is what you're looking for or not:
#include <iostream>
#include <set>
#include <utility>
#include <algorithm>
using namespace std;
struct Finder{
template<typename Value>
bool operator()(const Value& first, const Value& v) const{
return first == v;
}
};
template <typename Value>
struct FirstValueValue{
FirstValueValue(const Value& value): value(value){};
template<typename Pair>
bool operator()(const Pair& p) const{
return p.first == value;
}
Value value;
};
int main(int argc, char *argv[]) {
typedef std::set<std::pair<double,unsigned int> > SetOfPairs;
SetOfPairs myset;
myset.insert(std::make_pair(2.0,1));
myset.insert(std::make_pair(5.7,2));
Finder finder;
double v = 2.0;
for(SetOfPairs::iterator it = myset.begin(); it != myset.end(); it++){
if( finder(it->first,v) ){
cout << "found value " << v << std::endl;
}
}
FirstValueValue<double> find_double_two(2.0);
myset.insert(std::make_pair(2.0,100));
unsigned int count = std::count_if(myset.begin(),myset.end(),find_double_two);
cout << "found " << count << " occurances of " << find_double_two.value;
}
Which prints out:
found value 2
found 2 occurances of 2
I don't know what your needs are or if boost libraries are allowed but you could look into Boost Multi Index if you have to index off one part of the pair a lot.
Hope this helps. Good luck.

How does STL map know, that map contains a given element?

In question Using char as key in stdmap there is advised to use custom compare function/functor:
struct cmp_str
{
bool operator()(char const *a, char const *b)
{
return std::strcmp(a, b) < 0;
}
};
map<char *, int, cmp_str> BlahBlah;
This allows map to detect if key A is less than key B. But for example map<>::find() returns end if element is not found, and iterator to it if it is found. So map knows about equivalence, not only less-than. How?

The equality condition for two keys a and b are that a<b and b<a are both false. The map itself is commonly implemented as a balanced binary tree*, so the less-than comparison is used to traverse the map from the root node until the matching element is found. When searching for a key k, less-than comparison is used until the first element for which the comparison is false is found. If the inverse comparison is also false, k has been found. Otherwise, k is not in the map. The map only uses the less-than comparison to this purpose.
Note also that std::set uses exactly the same mechanism, the only difference being that each element is it's own key.
* strictly speaking, the C++ standard does not specify that std::map be a balanced binary tree, but the complexity constraints it places on operations such as insertion and look-up mean that implementations chose structures such as red-black tree.

Equivalence/operator== can be expressed as a function of operator<:
bool operator==(T left, T right) {
return !(left < right) && !(right < left);
}

This is because the comparator of the map must implement a strict weak ordering, such as <.
One of the mathematical properties of such a relation is Antisymmetry, which states that for any x and y, then not (x < y) and not (y < x) implies x == y.
Therefore, after finding the first element not to compare smaller than the key you are searching for, the implementation simply checks if that element compares greater, and it's neither smaller nor greater, then it must be equal.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Comparator for matching point in a range - c++

Related

How does std::set comparator function work?

Binary search on vector objects of an element greater than an attribute

multimap with custom keys - comparison function

search part of an item in a set

How does STL map know, that map contains a given element?

Categories

Resources