C++ STL Binary Search (lower_bound, upper_bound)

I have implemented a binary search like this:
typedef std::vector<Cell>::iterator CellVectorIterator;
typedef struct _Point {
char x,y;
} CoordinatePoint;
typedef struct _Cell {
...
CoordinatePoint coordinates;
} Cell;
struct CellEqualityByCoordinates
{
bool
operator()(const Cell& cell1, const Cell& cell2) const
{ return cell1.coordinates.x == cell2.coordinates.x && cell1.coordinates.y == cell2.coordinates.y; }
};
CellVectorIterator FindCellByCoordinates (CellVectorIterator first, CellVectorIterator last, const Cell &val)
{
return std::upper_bound(first, last, val, CellEqualityByCoordinates());
}
But it doesn't always find a value.
What's wrong with that?

Your comparison function will not work for a binary search. It is not supposed to determine equality; it is supposed to determine an order relation. Specifically, it should return true if the first argument definitely comes before the second in a sorted range. If the arguments should be considered equal, or the second comes before the first, it should return false. Your range also needs to be sorted by this same criterion for the binary search to work.
An example function that might work:
bool operator()(const Cell& cell1, const Cell& cell2) const
{
if (cell1.coordinates.x < cell2.coordinates.x) return true;
if (cell2.coordinates.x < cell1.coordinates.x) return false;
return cell1.coordinates.y < cell2.coordinates.y;
}
A similar example that doubles as a lesson in short-circuit boolean evaluation would be something like:
bool operator()(const Cell& cell1, const Cell& cell2) const
{
return (cell1.coordinates.x < cell2.coordinates.x) ||
(!(cell2.coordinates.x < cell1.coordinates.x) &&
cell1.coordinates.y < cell2.coordinates.y);
}
Both exhibit a property called strict weak ordering. It is required by the sorted standard library containers and by the sorting and binary search algorithms.
Yet another example uses std::pair, which already has a lexicographic operator< that does the above, and thus makes this considerably less complicated:
bool operator()(const Cell& cell1, const Cell& cell2) const
{
return std::make_pair(cell1.coordinates.x, cell1.coordinates.y) <
std::make_pair(cell2.coordinates.x, cell2.coordinates.y);
}
A similar algorithm is available for tuples via std::tie.
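As a rough sketch of the std::tie variant mentioned above: the comparator name CellLessByCoordinates is made up for illustration, and Cell is cut down to the one member that matters here, since the rest of the struct was elided in the question.

```cpp
#include <tuple>

// Minimal stand-ins for the question's types (other Cell members omitted).
struct CoordinatePoint { char x, y; };
struct Cell { CoordinatePoint coordinates; };

// Hypothetical name. Strict weak ordering: x is compared first, then y.
struct CellLessByCoordinates {
    bool operator()(const Cell& a, const Cell& b) const {
        // std::tie builds tuples of references, so nothing is copied.
        return std::tie(a.coordinates.x, a.coordinates.y) <
               std::tie(b.coordinates.x, b.coordinates.y);
    }
};
```

Adding a third coordinate later would only mean adding one more argument to each std::tie call.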
Of course, all of this assumes you have an actual ordered sequence in the first place, ordered by the same comparison logic (which we can only assume is true, as no evidence of it was posted).
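One more point worth checking: std::upper_bound returns the first element that compares greater than the value, so even with a proper ordering it does not point at a matching element. A possible sketch of the lookup, assuming the range was sorted with the same comparator, uses std::lower_bound plus an equivalence check. CellLessByCoordinates is a hypothetical name and Cell is reduced to its coordinates.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

struct CoordinatePoint { char x, y; };
struct Cell { CoordinatePoint coordinates; };  // other members omitted

// Hypothetical comparator: orders cells by x, then by y.
struct CellLessByCoordinates {
    bool operator()(const Cell& a, const Cell& b) const {
        return std::tie(a.coordinates.x, a.coordinates.y) <
               std::tie(b.coordinates.x, b.coordinates.y);
    }
};

typedef std::vector<Cell>::iterator CellVectorIterator;

// Precondition: [first, last) is sorted with CellLessByCoordinates.
// Returns an iterator to a cell with val's coordinates, or last if none.
CellVectorIterator FindCellByCoordinates(CellVectorIterator first,
                                         CellVectorIterator last,
                                         const Cell& val)
{
    CellLessByCoordinates less;
    // lower_bound yields the first element that is not less than val.
    CellVectorIterator it = std::lower_bound(first, last, val, less);
    // It is a match only if val is not less than that element either.
    if (it != last && !less(val, *it)) return it;
    return last;
}
```

Before calling it, the vector would be sorted once with std::sort(cells.begin(), cells.end(), CellLessByCoordinates()).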

Related

Checking whether an element is in a C++ set is really slow

I'm implementing an algorithm that involves a lot of checking whether elements are in a set/list. I was using std::vector containers but time was increasing exponentially as the vector would grow.
I've decided I would try using std::set containers in order not to have to explore the entire container to know whether it contains a certain element.
I implemented the following function that checks whether an element is part of a given set:
bool in_set(set<Node> node_set){
return node_set.find(*this) != node_set.end();
}
However, that function is taking around 2s for very small sets (1-3 elements) which makes my entire algorithm unusable.
The custom class I'm using looks like this:
class Node{
public:
int d;
int h_score;
int coordinates [3];
Node* parent_address;
};
The comparison operator that I implemented looks like this:
bool operator<(Node other) const{
return concatenate(concatenate(this->coordinates[0], this->coordinates[1]), this->coordinates[2]) <
concatenate(concatenate(other.coordinates[0], other.coordinates[1]), other.coordinates[2]);
}
Edit: The concatenate function does not seem to take a lot of time while executing, it looks like this:
int concatenate(int i, int j) {
int result = 0;
for (int x = i; x <= j; x++) {
result = result * 10 + x;
}
return result;
}
Do you know why it is taking so much time, and more importantly, how to make it faster?
First of all, you can try passing the set as a const reference and, in operator<, the Node as a const reference too:
bool in_set(const set<Node>& node_set){
return node_set.find(*this) != node_set.end();
}
And
bool operator<(const Node& other) const
That way references are used instead of copies of your set and Node objects.
Do you know why it is taking so much time
concatenate(1, 100000000) takes 1.3 seconds on my Raspberry Pi. Doing it that way is too slow and, in fact, pointless.
Note also that, because of possible overflows, concatenate can give the same result for different nodes, which is not compatible with an operator<: the set would treat such nodes as duplicates.
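Collisions do not even need an overflow. The loop in concatenate runs from i to j, so it returns 0 whenever i > j. The sketch below keeps concatenate exactly as posted and adds a made-up helper, node_key, that computes the value the question's operator< ends up comparing.

```cpp
// concatenate() exactly as posted in the question.
int concatenate(int i, int j) {
    int result = 0;
    for (int x = i; x <= j; x++) {
        result = result * 10 + x;
    }
    return result;
}

// Hypothetical helper: the key that the posted operator< compares.
int node_key(int c0, int c1, int c2) {
    return concatenate(concatenate(c0, c1), c2);
}
```

With these definitions, node_key(5, 3, 2) and node_key(7, 2, 2) are equal, so a std::set would keep only one of those two nodes.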
how to make it faster?
You have to find something other than these calls to concatenate to implement your operator<.
What do you need? Is the order in the set important, or can it be replaced by any other order?
It is not necessary to create a unique identifier to compare two nodes; compare them directly, for instance:
bool operator<(const Node & other) const{
if (coordinates[0] < other.coordinates[0])
return true;
// Use >, not >=: on equality, fall through to the next coordinate.
if (coordinates[0] > other.coordinates[0])
return false;
if (coordinates[1] < other.coordinates[1])
return true;
if (coordinates[1] > other.coordinates[1])
return false;
return (coordinates[2] < other.coordinates[2]);
}
To understand why that operator< works, you can think of node.coordinates as one big number three times the size of an int: I compare the high-order part first, then, if those are equal, the middle part, then, if those are equal too, the low-order part. That is enough for use in a set.
Your operator< takes a copy of the Node. There's also no need to build a combined number to compare; std::tuple can do the comparison for you:
How about:
bool operator<(const Node& other) const {
return std::make_tuple(coordinates[0], coordinates[1], coordinates[2]) <
std::make_tuple(other.coordinates[0], other.coordinates[1], other.coordinates[2]);
}
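Putting both answers together, a minimal sketch of the class might look like the following. It uses std::tie instead of std::make_tuple so that the coordinates are compared by reference. The make_node helper is invented here purely to build test values.

```cpp
#include <set>
#include <tuple>

class Node {
public:
    int d;
    int h_score;
    int coordinates[3];
    Node* parent_address;

    // Lexicographic comparison of the three coordinates; no copies are made.
    bool operator<(const Node& other) const {
        return std::tie(coordinates[0], coordinates[1], coordinates[2]) <
               std::tie(other.coordinates[0], other.coordinates[1],
                        other.coordinates[2]);
    }

    // The set is taken by const reference, so calling this never copies it.
    bool in_set(const std::set<Node>& node_set) const {
        return node_set.find(*this) != node_set.end();
    }
};

// Hypothetical helper for building nodes in examples.
Node make_node(int x, int y, int z) {
    Node n = {0, 0, {x, y, z}, nullptr};
    return n;
}
```

Note that two nodes with the same coordinates count as the same set element here, whatever their d, h_score or parent_address.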

Efficient way to remove duplicates

Following the answer in the thread "What's the most efficient way to erase duplicates and sort a vector?", I wrote the following code, but I got an error complaining about no match for ‘operator<’ (operand types are ‘const connector’ and ‘const connector’).
connector is a class I wrote myself, it basically is a line with two geometry points. uniqCntrs is a std::vector. It has 100% duplicates in it, which means each element has a duplicate, the size of uniqCntrs is quite big. What's wrong with my code, and how to deal with this situation?
std::set<connector> uniqCntrsSet;
for(unsigned int i = 0; i < uniqCntrs.size(); ++i )
{
uniqCntrsSet.insert(uniqCntrs[i]);
}
uniqCntrs.assign(uniqCntrsSet.begin(), uniqCntrsSet.end());
Edit:
I have no idea how to define < operator for my connector class. I mean it is physically meaningless to say one line is smaller than the other.
From cppreference:
std::set is an associative container that contains a sorted set of unique objects of type Key. Sorting is done using the key comparison function Compare.
The second template argument of std::set, Compare, defaults to std::less, which by default compares the objects with operator<. To fix the issue you can simply define operator< for your Key type (that is, connector).
Actually, operator< is just used to efficiently order the sorted structure (typically a balanced tree) that std::set uses internally. It does not need to make any physical sense. The only requirement is that the operator satisfies the standard mathematical definition of a strict weak ordering.
Look at this point example:
class Point
{
public:
Point(int x, int y) : x(x), y(y) {
}
public:
bool operator==(const Point &other) const {
return x==other.x && y==other.y;
}
bool operator!=(const Point &other) const {
return !operator==(other);
}
bool operator<(const Point &other) const {
if (x==other.x) {
return y<other.y;
} else {
return x<other.x;
}
}
private:
int x;
int y;
};
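Applied to the question, a sketch could look like this. The real connector class was not posted, so the layout below (two points with int coordinates) is purely an assumption, and remove_duplicates is a made-up helper name. It uses the sort-then-unique approach instead of going through a std::set.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical layout: the real connector class was not posted.
struct point { int x, y; };
struct connector { point a, b; };

// Arbitrary but consistent order: compare the four coordinates in sequence.
bool operator<(const connector& l, const connector& r) {
    return std::tie(l.a.x, l.a.y, l.b.x, l.b.y) <
           std::tie(r.a.x, r.a.y, r.b.x, r.b.y);
}

// Equality over the same four fields, needed by std::unique.
bool operator==(const connector& l, const connector& r) {
    return std::tie(l.a.x, l.a.y, l.b.x, l.b.y) ==
           std::tie(r.a.x, r.a.y, r.b.x, r.b.y);
}

// Sorts the vector, then erases adjacent duplicates.
void remove_duplicates(std::vector<connector>& v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}
```

If a connector from A to B should count as equal to one from B to A, the endpoints would have to be put into a canonical order first; this sketch does not do that.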

Most effective way to compare two pairs of integers

A class contains two integers; there are two instances of this class. I want to compare them to ensure that the two instances contain the same two numbers (their orders don't matter).
I can do this:
bool operator==(const Edge &e, const Edge &f) {
return ((e.p1 == f.p1) || (e.p1 == f.p2)) && ((e.p2 == f.p1) || (e.p2 == f.p2));
}
Is this the best way there is? There will be many such comparisons so I want to make sure I make the most efficient choice. BTW, the operator will be primarily used by the std::unordered_set class - in case this information matters.
I think you have logic mixed up a bit... if I understand you correctly, given pairs (a,b) and (x,y), you want to check that (a,b) == s(x,y), for some permutation s?
bool operator==(const Edge &e, const Edge &f) {
return ((e.p1 == f.p1) && (e.p2 == f.p2)) ||
((e.p2 == f.p1) && (e.p1 == f.p2));
}
As for performance... there is nothing to optimize here. Go look somewhere else if your program is slow.
This is probably not the fastest, and it requires C++11. But it's nice and short:
bool operator==(const Edge& e, const Edge& f) {
return std::minmax(e.p1, e.p2) == std::minmax(f.p1, f.p2);
}
It also suggests an optimization (which I generally use): keep p1 and p2 in order so that minmax doesn't need to be called every time. Then you do have an optimal solution.
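A rough sketch of that optimization, assuming Edge can be given a constructor: the endpoints are sorted once on construction, so operator== becomes a plain member-wise comparison. Since the question mentions std::unordered_set, a hash is included too. EdgeHash and its mixing constant are illustrative choices, not something from the thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Sketch of an Edge that stores its endpoints in sorted order.
struct Edge {
    int p1, p2;  // invariant set up by the constructor: p1 <= p2
    Edge(int a, int b) : p1(std::min(a, b)), p2(std::max(a, b)) {}
};

// With the endpoints already ordered, equality is member-wise.
bool operator==(const Edge& e, const Edge& f) {
    return e.p1 == f.p1 && e.p2 == f.p2;
}

// Hypothetical hash functor; the mixing constant is an arbitrary choice.
struct EdgeHash {
    std::size_t operator()(const Edge& e) const {
        std::size_t h1 = std::hash<int>()(e.p1);
        std::size_t h2 = std::hash<int>()(e.p2);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};
```

The container would then be declared as std::unordered_set<Edge, EdgeHash>. Because the members are public, code that writes to p1 or p2 directly could break the invariant; making them private would be the safer design.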
This will work ok for two. However, for any more, it obviously gets very ugly very fast. In fact, you'll need to do n! comparisons to check when you have n variables if you do this in the "naive" way.
An easier way is something like the following:
static constexpr std::size_t number()  // static member function of Edge
{
return <number_of_values>;
}
bool operator==(const Edge& e, const Edge& f)
{
constexpr std::size_t size = Edge::number();
std::array<int, size> earr = {{e.p1, e.p2, ..., e.pn}};
std::array<int, size> farr = {{f.p1, f.p2, ..., f.pn}};
return std::is_permutation(earr.begin(), earr.end(), farr.begin());
}
If it is always two, you can simply write this as:
bool operator==(const Edge& e, const Edge& f)
{
std::array<int, 2> earr = {{e.p1, e.p2}};
std::array<int, 2> farr = {{f.p1, f.p2}};
return std::is_permutation(earr.begin(), earr.end(), farr.begin());
}
Testing unordered equality is the same as testing if one sequence is a permutation of the other.
Edit: Which, as should have been obvious to me, can be tested by checking that the sorted sequences are equal. Replace std::is_permutation with std::sort and std::equal in the above, which will be O(n log n) instead of O(n^2).
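A minimal sketch of the sort-and-compare variant, written as a free function over std::array so it does not depend on how Edge stores its values. The name same_values_any_order is made up for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical helper. Both arrays are taken by value, so the callers'
// data is left untouched while the local copies are sorted.
template <std::size_t N>
bool same_values_any_order(std::array<int, N> a, std::array<int, N> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    // Both arrays have the same length N, so the 3-iterator form is safe.
    return std::equal(a.begin(), a.end(), b.begin());
}
```

An operator== for Edge could fill two arrays from its members, as in the answer above, and forward them to this function.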
If I understand your question correctly, I suggest the following solution.
Just add one function to your class, as below:
bool isExist(int point)
{
// this is a pointer in C++, so members are reached with ->
if(this->p1==point || this->p2==point)
return true;
else
return false;
}
You can apply your logic with calling this function.
Please correct me if I am wrong...

C++ std::set Find function overloading == operator

I am using sets. I use a custom struct as the key. I am inserting a value and trying to find the inserted value. But it never seems to find the element.
I have overridden both the == operator and the < operator.
Here is the code of the structure:
struct distance_t
{
public:
int id;
double distance;
bool operator<(const distance_t& rhs) const
{
if(distance < rhs.distance)
return true;
else
return false;
}
bool operator==( const distance_t& rhs)
{
if(id == rhs.id)
return true;
else
return false;
}
};
And this is the code of main
int main()
{
set<distance_t> currentSet;
distance_t insertDistance;
insertDistance.id =1;
insertDistance.distance = 0.5;
currentSet.insert(insertDistance);
distance_t findDistance;
findDistance.id = 1;
assert(currentSet.find(findDistance) != currentSet.end());
}
It always fails in the assert statement. What am I doing wrong?
Edit: OK, now I understand that it does not use the == operator at all. Here is what I want: I need the data structure to be ordered by distance, but I should be able to remove an element using its id. Is there a clean way, or an existing data structure, to do this?
It fails because your less-than comparison uses distance_t::distance, which you are not setting in findDistance:
distance_t findDistance;
findDistance.id = 1;
std::set does not use operator== for anything. It only uses operator<. So you would have to change its logic to use distance_t::id.
If you want to search by id without changing the set's ordering, you can use std::find:
set<distance_t>::iterator it = std::find(currentSet.begin(),
currentSet.end(),
findDistance);
This will use your operator==, which has to be declared const for this to compile, because the elements of a set are const. Bear in mind that this has linear time complexity.
Because operator== is not invoked at all. Comparing elements is like:
!(a < b) && !(b < a)
In other words, it uses operator<.
As you haven't assigned a value to findDistance.distance, the result of the less-than comparison is undefined.
Note that your definitions of the equality and less-than comparison operators are dangerous, because it is easy to define instances of distance_t where their results are inconsistent. One example is two instances with the same distance but different ids.
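For the follow-up in the question's edit (ordered by distance, removable by id), one possible sketch pairs a set ordered by (distance, id) with a lookup table from id to distance. DistanceIndex and all its member names are invented for illustration; a multi-index container from a library would be another route.

```cpp
#include <cstddef>
#include <set>
#include <tuple>
#include <unordered_map>

struct distance_t {
    int id;
    double distance;
    // Order by distance, then id, so equal distances with different ids coexist.
    bool operator<(const distance_t& rhs) const {
        return std::tie(distance, id) < std::tie(rhs.distance, rhs.id);
    }
};

// Hypothetical wrapper: a set ordered by distance plus an id -> distance index.
class DistanceIndex {
public:
    void insert(int id, double distance) {
        erase(id);  // keep at most one entry per id
        ordered_.insert(distance_t{id, distance});
        by_id_[id] = distance;
    }
    // Removes the entry with this id; returns false if there was none.
    bool erase(int id) {
        std::unordered_map<int, double>::iterator it = by_id_.find(id);
        if (it == by_id_.end()) return false;
        ordered_.erase(distance_t{id, it->second});
        by_id_.erase(it);
        return true;
    }
    bool contains(int id) const { return by_id_.count(id) != 0; }
    std::size_t size() const { return ordered_.size(); }
    // Precondition: the index is not empty.
    const distance_t& closest() const { return *ordered_.begin(); }
private:
    std::set<distance_t> ordered_;
    std::unordered_map<int, double> by_id_;
};
```

The stored distance is what lets erase rebuild the exact key, so the removal from the set stays logarithmic instead of requiring a linear scan.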

std::map::find()

I have a simple struct which i'm using as a key in a std::map
struct PpointKey{
unsigned int xp,yp; //pixel coordinates
unsigned int side;
PpointKey(unsigned xp,unsigned yp,unsigned side=5):xp(xp),yp(yp),side(side)
{}
bool operator==(const PpointKey& other) const{
const unsigned int x = other.xp;
const unsigned int y = other.yp;
return ((x>=xp && x<=xp+side) && (y>=yp && y<=yp+side));
}
bool operator<(const PpointKey& other) const{
const unsigned int x = other.xp;
const unsigned int y = other.yp;
const unsigned other_distance_2 = x*x + y*y;
const unsigned this_distance_2 = this->xp*this->xp + this->yp * this->yp;
return this_distance_2 < other_distance_2;
}
};
What I would like to achieve is to use the find() to access the map with a key that has its xp,yp attributes within a side distance. In other words, if I have an (x,y) tuple, I would like to find inside the map the first PpointKey that fulfils the condition inside the operator== function
return ((x>=xp && x<=xp+side) && (y>=yp && y<=yp+side));
Is this possible using find? I'm getting map.end(), so I would like to check whether the find() function uses operator==. Maybe the search algorithm would be better?
Thanks in advance.
The find function of map does not use the operator==.
However, you can use std::find_if with a predicate on the key, passing in the begin() and end() iterators of the map (std::find itself would compare whole key/value pairs). It will simply iterate through the sequence one element at a time and yield the first object that matches (complexity is linear).
The issue you encounter is due to the fact that you have abused operator overloading. The problem here is that the common definition of operator== is:
bool operator==(const T& lhs, const T& rhs)
{
return !(lhs < rhs) && !(rhs < lhs);
}
And this is not the case with your definition, thus you cannot substitute one for the other.
It would be best if you used traditional functions with expressive names rather than operator overloading; it would be less misleading. Note that std::map accepts a custom comparison object and std::find_if accepts a predicate, so you don't need to overload the operators to use them.
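As an illustration of that advice, here is a sketch in which operator< is a plain lexicographic order used only to keep the map sorted, and the "point inside the square" test gets a name of its own. The names contains, PointMap and find_containing, and the std::string value type, are assumptions made for this example.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <tuple>

struct PpointKey {
    unsigned int xp, yp;  // pixel coordinates
    unsigned int side;
    PpointKey(unsigned xp, unsigned yp, unsigned side = 5)
        : xp(xp), yp(yp), side(side) {}

    // Plain lexicographic order, used only to keep the map sorted.
    bool operator<(const PpointKey& other) const {
        return std::tie(xp, yp) < std::tie(other.xp, other.yp);
    }
    // Named test instead of an overloaded operator==.
    bool contains(unsigned x, unsigned y) const {
        return x >= xp && x <= xp + side && y >= yp && y <= yp + side;
    }
};

typedef std::map<PpointKey, std::string> PointMap;  // value type is an assumption

// Linear scan: returns the first entry whose square contains (x, y), or end().
PointMap::const_iterator find_containing(const PointMap& m,
                                         unsigned x, unsigned y) {
    return std::find_if(m.begin(), m.end(),
        [x, y](const PointMap::value_type& entry) {
            return entry.first.contains(x, y);
        });
}
```

The scan is linear in the size of the map; "first" here means first in the map's (xp, yp) order.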