How to find distinct pairs using std::ranges? [duplicate] - c++

I have been trying to understand the new ranges library and to convert some of my more traditional for loops into functional code. The example code given by cppreference is very straightforward and readable. However, I am unsure how to apply ranges to a vector of Points where every pair of x and y values has to be looked at, a distance calculated, and the results compared at the end to find the greatest distance.
struct Point
{
double x;
double y;
};
double ComputeDistance(const Point& p1, const Point& p2)
{
return std::hypot(p1.x - p2.x, p1.y - p2.y);
}
double GetMaxDistance(const std::vector<Point>& points)
{
double maxDistance = 0.0;
for (int i = 0; i < points.size(); ++i)
{
for(int j = i; j < points.size(); ++j)
{
maxDistance = std::max(maxDistance, ComputeDistance(points.at(i),points.at(j)));
}
}
return maxDistance;
}
GetMaxDistance is the code that I would love to clean up by applying ranges to it. I thought it would be as simple as doing something like:
double GetMaxDistance(const std::vector<Point>& points)
{
auto result = points | std::views::transform(ComputeDistance);
return static_cast<double>(result);
}
And then I realized that was not correct since I am not passing any values into the function. So I thought:
double GetMaxDistance(const std::vector<Point>& points)
{
for(auto point : points | std::views::transform(ComputeDistance))
// get the max distance somehow and return it?
// Do I add another for(auto nextPoint : points) here and drop the first item?
}
But then I realized that I am applying that function to every point, but not to the point next to it, and this would not work anyway since I am still only passing one argument into ComputeDistance. Since I need the distance between all points in the vector, I have to compare each point to every other point and do the calculation, which leaves it as an n^2 algorithm. I am not trying to beat n^2; I would just like to know if there is a way to make this traditional for loop take on a modern, functional approach.
Which brings us back to the title. How do I apply std::ranges in this case? Is it even possible to do with what the standard has given us at this point? I know more is to be added in C++23. So I don't know if this cannot be achieved until that releases or if this is not possible to do at all.
Thanks!

The algorithm you're looking for is combinations - but there's no range adaptor for that (neither in C++20 nor range-v3 nor will be in C++23).
However, we can manually construct it in this case using an algorithm usually called flat-map:
inline constexpr auto flat_map = [](auto f){
return std::views::transform(f) | std::views::join;
};
which we can use as follows:
double GetMaxDistance(const std::vector<Point>& points)
{
namespace rv = std::views;
return std::ranges::max(
rv::iota(0u, points.size())
| flat_map([&](size_t i){
return rv::iota(i+1, points.size())
| rv::transform([&, i](size_t j){ // copy i: the returned view outlives this call
return ComputeDistance(points[i], points[j]);
});
}));
}
The outer iota is our first loop. And then for each i, we get a sequence from i+1 onwards to get our j. And then for each (i,j) we calculate ComputeDistance.
Or if you want the transform at top level (arguably cleaner):
double GetMaxDistance(const std::vector<Point>& points)
{
namespace rv = std::views;
return std::ranges::max(
rv::iota(0u, points.size())
| flat_map([&](size_t i){
return rv::iota(i+1, points.size())
| rv::transform([&, i](size_t j){ // copy i: the returned view outlives this call
return std::pair(i, j);
});
})
| rv::transform([&](auto p){
return ComputeDistance(points[p.first], points[p.second]);
}));
}
or even (this version produces a range of pairs of references to Point, to allow a more direct transform):
double GetMaxDistance(const std::vector<Point>& points)
{
namespace rv = std::views;
namespace hof = boost::hof;
return std::ranges::max(
rv::iota(0u, points.size())
| flat_map([&](size_t i){
return rv::iota(i+1, points.size())
| rv::transform([&, i](size_t j){ // copy i: the returned view outlives this call
return std::make_pair(
std::ref(points[i]),
std::ref(points[j]));
});
})
| rv::transform(hof::unpack(ComputeDistance)));
}
These all basically do the same thing, it's just a question of where and how the ComputeDistance function is called.
C++23 will add cartesian_product and chunk (range-v3 has them now), and zip_transform was added just recently, which will also allow:
double GetMaxDistance(const std::vector<Point>& points)
{
namespace rv = std::views;
namespace hof = boost::hof;
return std::ranges::max(
rv::zip_transform(
rv::drop,
rv::cartesian_product(points, points)
| rv::chunk(points.size()),
rv::iota(1))
| rv::join
| rv::transform(hof::unpack(ComputeDistance))
);
}
cartesian_product by itself would give you all pairs, which includes (x, x) for every x as well as both (x, y) and (y, x); you want neither the self-pairs nor the duplicates. When we chunk it by points.size() (producing N ranges of length N), we then drop a steadily increasing (iota(1)) number of elements from each chunk: just one from the first chunk (the pair that contains the first element twice), then two from the second chunk (the (points[1], points[0]) and (points[1], points[1]) elements), and so on.
The zip_transform part still produces a range of chunks of pairs of Point, the join reduces it to a range of pairs of Point, which we then need to unpack into ComputeDistance.
This all exists in range-v3 (except zip_transform there is named zip_with). In range-v3 though, you get common_tuple, which Boost.HOF doesn't support, but you can make it work.

Related

More efficient way to get indices of a binary mask in Eigen3?

I currently have a bool mask vector generated in Eigen. I would like to use this binary mask the way I would in Python with NumPy, where, depending on the True values, I get a sub-matrix or a sub-vector that I can do further calculations on.
To achieve this in Eigen, i currently "convert" the mask vector into another vector containing the indices by simply iterating over the mask:
Eigen::Array<bool, Eigen::Dynamic, 1> mask = ... // E.G.: [0, 1, 1, 1, 0, 1];
Eigen::Array<uint32_t, Eigen::Dynamic, 1> mask_idcs(mask.count(), 1);
int z_idx = 0;
for (int z = 0; z < mask.rows(); z++) {
if (mask(z)) {
mask_idcs(z_idx++) = z;
}
}
// do further calculations on vector(mask_idcs)
// E.G.: vector(mask_idcs)*3 + another_vector
However, I want to optimize this further and am wondering if Eigen3 provides a more elegant solution for this, something like vector(from_bin_mask(mask)), which may benefit from the library's optimizations.
There are already some questions here in SO, but none seems to answer this simple use-case
(1, 2). Some refer to the select-function, which returns an equally sized vector/matrix/array, but i want to discard elements via a mask and only work further with a smaller vector/matrix/array.
Is there a way to do this in a more elegant way? Can this be optimized otherwise?
(I am using the Eigen::Array-type since most of the calculations are element-wise in my use-case)
As far as I'm aware, there is no "off the shelf" solution using Eigen's methods. However, it is interesting to note that (at least for Eigen versions 3.4.0 and later) you can use a std::vector<int> for indexing (see this section). Therefore the code you've written could be simplified to
Eigen::Array<bool, Eigen::Dynamic, 1> mask = ... // E.G.: [0, 1, 1, 1, 0, 1];
std::vector<int> mask_idcs;
for (int z = 0; z < mask.rows(); z++) {
if (mask(z)) {
mask_idcs.push_back(z);
}
}
// do further calculations on vector(mask_idcs)
// E.G.: vector(mask_idcs)*3 + another_vector
If you're using c++20, you could use an alternative implementation using std::ranges without using raw for-loops:
int const N = mask.size();
auto c = iota(0, N) | filter([&mask](auto const& i) { return mask[i]; });
auto masked_indices = std::vector(begin(c), end(c));
// ... Use it as vector(masked_indices) ...
I've implemented some minimal examples in compiler explorer in case you'd like to check out. I honestly wished there was a simpler way to initialize the std::vector from the raw range, but it's currently not so simple. Therefore I'd suggest you to wrap the code into a helper function, for example
auto filtered_indices(auto const& mask) // or as you've suggested from_bin_mask(auto const& mask)
{
using std::ranges::begin;
using std::ranges::end;
using std::views::filter;
using std::views::iota;
int const N = mask.size();
auto c = iota(0, N) | filter([&mask](auto const& i) { return mask[i]; });
return std::vector(begin(c), end(c));
}
and then use it as, for example,
Eigen::ArrayXd F(5);
F << 0.0, 1.1548, 0.0, 0.0, 2.333;
auto mask = (F > 1e-15).eval();
auto D = (F(filtered_indices(mask)) + 3).eval();
It's not as clean as in numpy, but it's something :)
I have found another way, which seems more elegant than checking each element of the mask one by one:
Eigen::SparseMatrix<bool> mask_sparse = mask.matrix().sparseView();
for (uint32_t k = 0; k<mask_sparse.outerSize(); ++k) {
for (Eigen::SparseMatrix<bool>::InnerIterator it(mask_sparse, k); it; ++it) {
std::cout << it.row() << std::endl; // row index
std::cout << it.col() << std::endl; // col index
// Do stuff or build up an array
}
}
Here we can at least build up a vector (or multiple vectors, if we have more dimensions) and then later use it to "mask" a vector or matrix. (This is taken from the documentation).
So applied to this specific usecase, we simply do:
Eigen::Array<uint32_t, Eigen::Dynamic, 1> mask_idcs(mask.count(), 1);
Eigen::SparseVector<bool> mask_sparse = mask.matrix().sparseView();
int z_idx = 0;
for (Eigen::SparseVector<bool>::InnerIterator it(mask_sparse); it; ++it) {
mask_idcs(z_idx++) = it.index();
}
// do Stuff like vector(mask_idcs)*3 + another_vector
However, I do not know which version is faster for large masks containing thousands of elements.

Eigen Matrix sum with NANs

I have an Eigen Matrix A which includes NAN values. I want to get the sum of differences of this matrix to multiple other matrices.
double getDistance(const Eigen::MatrixXf& from, const Eigen::MatrixXf& to)
{
Eigen::MatrixXf difference = (to - from).cwiseAbs2();
difference = difference.unaryExpr([](float v) // unaryExpr expects a one-argument functor
{ return std::isnan(v) ? 0.0f : v;});
double distance = difference.sum();
return distance;
}
std::vector<double> getDistances(const std::vector<Eigen::MatrixXf>& from, const Eigen::MatrixXf& to)
{
std::vector<double> distances;
for (int i = 0; i < from.size(); ++i)
{
distances.push_back(getDistance(from[i], to));
}
return distances;
}
Right now I need to remove the NANs of difference every single time and then take the sum.
I was thinking about doing my own sum function which skips NANs.
Is there an elegant way to do this?
Does unaryExpr work for summing up where we need an "out parameter"?
I would recommend following starmole's recommendation first, but to answer the question, isNaN and select are what you want:
return (to-from).array().isNaN().select(0,to-from).squaredNorm();
With the release of Eigen 3.4, handling NaN propagation got improved.
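To make the "own sum function which skips NaNs" idea from the question concrete, here is a rough sketch on plain std::vector<float> rather than on Eigen types. The function name and the use of std::vector are assumptions for illustration only; with Eigen you would normally stay with the isNaN/select expression above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Eigen version: sums the squared differences
// and skips every position where the difference is NaN.
double sum_squared_diff_skip_nan(const std::vector<float>& from,
                                 const std::vector<float>& to) {
    // Only the overlapping prefix is compared if the sizes differ.
    const std::size_t n = std::min(from.size(), to.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const float d = to[i] - from[i];
        if (!std::isnan(d)) {
            sum += static_cast<double>(d) * d;  // accumulate in double
        }
    }
    return sum;
}
```

A single pass like this avoids materialising the intermediate difference matrix, which is the same thing the one-line Eigen expression achieves through lazy evaluation.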

using C++ priority_queue comparator correctly

This question was asked in an interview recently
public interface PointsOnAPlane {
/**
* Stores a given point in an internal data structure
*/
void addPoint(Point point);
/**
* For given 'center' point returns a subset of 'm' stored points that are
* closer to the center than others.
*
* E.g. Stored: (0, 1) (0, 2) (0, 3) (0, 4) (0, 5)
*
* findNearest(new Point(0, 0), 3) -> (0, 1), (0, 2), (0, 3)
*/
vector<Point> findNearest(vector<Point> points, Point center, int m);
}
This is the approach I used:
1) Create a max heap priority_queue to store the closest points
priority_queue<Point,vector<Point>,comp> pq;
2) Iterate the points vector and push a point if priority queue size < m
3) If size == m then compare the queue top with current point and pop if necessary
for(int i=0;i<points.size();i++)
{
if(pq.size() < m)
{
pq.push(points[i]);
}
else
{
if(compareDistance(points[i],pq.top(),center))
{
pq.pop();
pq.push(points[i]);
}
}
}
4) Finally put the contents of priority queue in a vector and return.
How should I write the comp and the compareDistance comparator which will allow me to store m points initially and then compare the current point with the one on top?
I think your approach can be changed so that it uses the priority_queue in a different way. The code becomes a bit complex since there's an if-statement in the for loop, and this if-statement controls when to add to the priority_queue. Why not add all the points to the priority_queue first, and then pop out m points? Let the priority_queue do all the work.
The key to implementing the findNearest function using a priority_queue is to realize that the comparator can be a lambda that captures the center parameter. So you can do something like so:
#include <queue>
#include <vector>
using namespace std;
struct Point { int x, y; };
constexpr int distance(const Point& l, const Point& r)
{
return (l.x - r.x)*(l.x - r.x) + (l.y - r.y)*(l.y - r.y);
}
vector<Point> findNearest(const vector<Point>& points, Point center, int m)
{
auto comparator = [center](const Point& l, const Point& r) {
return distance(l, center) > distance(r, center);
};
priority_queue<Point, vector<Point>, decltype(comparator)> pq(comparator);
for (auto&& p : points) {
pq.emplace(p);
}
vector<Point> result;
for (int i = 0; i < m && !pq.empty(); ++i) { // stop early if fewer than m points exist
result.push_back(pq.top());
pq.pop();
}
return result;
}
In an interview setting it's also good to talk about the flaws in the algorithm.
This implementation runs in O(n log n). Since you only need the closest m points, that can be beaten: a heap bounded to m elements gives O(n log m), and a selection algorithm such as std::nth_element is O(n) on average.
It uses O(n) more space because of the queue, and we should be able to do better. What's really happening in this function is a sort, and sorts can be implemented in-place.
It is prone to integer overflow. A good idea would be to use a template on the Point struct. You can also use a template to make the points container generic in the findNearest function. The container just has to support iteration.
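For comparison, the bounded max-heap approach described in the question can be sketched roughly as follows. This is only an illustration under my own assumptions: the names squaredDistance and findNearestBounded are made up, m is taken as a std::size_t, and the squared distance is computed in long long, which makes overflow much less likely than with int but does not rule it out for extreme coordinates.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

struct Point { int x, y; };

// Squared distance in 64-bit arithmetic (hypothetical helper).
long long squaredDistance(const Point& a, const Point& b) {
    const long long dx = static_cast<long long>(a.x) - b.x;
    const long long dy = static_cast<long long>(a.y) - b.y;
    return dx * dx + dy * dy;
}

std::vector<Point> findNearestBounded(const std::vector<Point>& points,
                                      Point center, std::size_t m) {
    // "Less than" by distance, so top() is the farthest point kept so far.
    auto closer = [center](const Point& l, const Point& r) {
        return squaredDistance(l, center) < squaredDistance(r, center);
    };
    std::priority_queue<Point, std::vector<Point>, decltype(closer)> pq(closer);
    for (const Point& p : points) {
        if (pq.size() < m) {
            pq.push(p);                        // still filling the first m slots
        } else if (m > 0 && closer(p, pq.top())) {
            pq.pop();                          // evict the current farthest point
            pq.push(p);
        }
    }
    // Popping yields farthest-first, so fill the result from back to front.
    std::vector<Point> result(pq.size());
    for (std::size_t i = result.size(); i-- > 0;) {
        result[i] = pq.top();
        pq.pop();
    }
    return result;
}
```

The heap never holds more than m points, so this takes O(n log m) time and O(m) extra space, and the same lambda serves as both the comp and the compareDistance the question asks about.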

parallel_reduce on double returning incorrect result

I am trying to use Intel TBB parallel_reduce to obtain the sum of array elements consisting of doubles. However the result is different compared to OpenMP reduction implementation.
Here is the OpenMP one:
double dAverageTemp = 0.0;
#pragma omp parallel for reduction(+:dAverageTemp)
for (int i = 0; i < sCartesianSize; i++)
dAverageTemp += pdTempCurr[i];
This code returns the correct value which is "317.277493"; however this TBB code:
double dAverageTemp = tbb::parallel_reduce(tbb::blocked_range<double*>(pdTempCurr, pdTempCurr + sCartesianSize - 1),
0.0,
[](const tbb::blocked_range<double*> &r, double value) -> double {
return std::accumulate(r.begin(), r.end(), value);
},
std::plus<double>()
);
insists that the result is "317.277193".
What am I missing here?
Although all the comments about the order of summation are perfectly correct, the simple truth is that you have a bug in your code. All std::, thrust:: and tbb:: algorithms and constructors follow the same convention when defining ranges: they run from the first element to take up to, but not including, the first element not to take, as in for (auto it = v.begin(); it != v.end(); ++it).
Therefore, here, your code for tbb::blocked_range should go up to pdTempCurr + sCartesianSize, not to pdTempCurr + sCartesianSize - 1.
It should become:
double dAverageTemp = tbb::parallel_reduce(tbb::blocked_range<double*>(pdTempCurr, pdTempCurr + sCartesianSize ),
0.0,
[](const tbb::blocked_range<double*> &r, double value) -> double {
return std::accumulate(r.begin(), r.end(), value);
},
std::plus<double>()
);
My (wild) guess is that pdTempCurr[sCartesianSize-1] is around 0.0003 which will account for the numerical difference experienced.
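The effect of the half-open convention is easy to reproduce without TBB. The following is just a small sketch with made-up function names, using std::accumulate on a std::vector in place of the raw pointer range.

```cpp
#include <numeric>
#include <vector>

// Sums the half-open range [begin, end): every element is included.
double sum_all(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

// Reproduces the off-by-one: [begin, end - 1) silently skips the last element.
double sum_all_but_last(const std::vector<double>& v) {
    if (v.empty()) return 0.0;  // end() - 1 would be invalid on an empty vector
    return std::accumulate(v.begin(), v.end() - 1, 0.0);
}
```

The two results differ by exactly the last element, which is the kind of small discrepancy reported in the question.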

How do I delete the closest "Point" object in a STD::List to some x,y?

I have a point class like:
class Point {
public:
int x, y;
Point(int x1, int y1)
{
x = x1;
y = y1;
}
};
and a list of points:
std::list <Point> pointList;
std::list <Point>::iterator iter;
I'm pushing points on to my pointList (although the list might contain no Points yet if none have been pushed yet).
I have two questions:
How can I delete the closest point to some arbitrary (x, y) from the list?
Let's say I have the x,y (5,12) and I want to find the Point in the list closest to that point and remove it from the std::list.
I know I'll have to use the distance formula and I'll have to iterate through the list using an iterator but I'm having some trouble conceptualizing how I'll keep track of which point is the closest as I iterate through the list.
How can I return an array or list of points within x radius of a given (x,y)?
Similar to the last question except I need a list of pointers to the "Point" objects within say 5 radius of a given (x,y). Also, should I return an array or a List?
If anyone can help me out, I'm still struggling my way through C++ and I appreciate it.
Use a std::list::iterator variable to keep track of the closest point as you loop through the list. When you get to the end of the list it will contain the closest point and can be used to erase the item.
void erase_closest_point(list<Point>& pointList, const Point& point) // non-const: we erase from it
{
if (!pointList.empty())
{
list<Point>::iterator closestPoint = pointList.begin();
float closestDistance = sqrt(pow(point.x - closestPoint->x, 2) +
pow(point.y - closestPoint->y, 2));
// for each remaining point in the list
// (list iterators have no operator+, so use std::next from <iterator>)
for (list<Point>::iterator it = std::next(closestPoint);
it != pointList.end(); ++it)
{
const float distance = sqrt(pow(point.x - it->x, 2) +
pow(point.y - it->y, 2));
// is the point closer than the previous best?
if (distance < closestDistance)
{
// replace it as the new best
closestPoint = it;
closestDistance = distance;
}
}
pointList.erase(closestPoint);
}
}
Building a list of points within a radius of a given point is similar. Note that an empty radius list is passed into the function by reference. Adding the points to the list by reference will eliminate the need for copying all of the points when returning the vector by value.
void find_points_within_radius(vector<Point>& radiusListOutput,
const list<Point>& pointList,
const Point& center, float radius)
{
// for each point in the list
for (list<Point>::const_iterator it = pointList.begin();
it != pointList.end(); ++it)
{
const float distance = sqrt(pow(center.x - it->x, 2) +
pow(center.y - it->y, 2));
// if the distance from the point is within the radius
if (distance < radius)
{
// add the point to the new list
radiusListOutput.push_back(*it);
}
}
}
Again, using copy_if with a function object:
struct RadiusChecker {
RadiusChecker(const Point& center, float radius)
: center_(center), radius_(radius) {}
bool operator()(const Point& p)
{
const float distance = sqrt(pow(center_.x - p.x, 2) +
pow(center_.y - p.y, 2));
return distance < radius_;
}
private:
const Point& center_;
float radius_;
};
void find_points_within_radius(vector<Point>& radiusListOutput,
const list<Point>& pointList,
const Point& center, float radius)
{
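radiusListOutput.reserve(pointList.size());
// copy_if (C++11) keeps the elements for which the predicate is true;
// back_inserter appends, since reserve() alone does not create elements
copy_if(pointList.begin(), pointList.end(),
back_inserter(radiusListOutput),
RadiusChecker(center, radius));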
}
Note that the sqrt can be removed if you need extra performance, since the square of the magnitude works just as well for these comparisons. Also, if you really want to increase performance, then consider a data structure that allows for scene partitioning, like a quadtree. The first problem is closely related to collision detection, and there is a ton of valuable information about that topic available.
You are right on how it should be made. Just iterate through all items in the list and keep track of the smallest distance already found, and the nearest point you found in two variables, making sure you don't match the point with itself if the problem states so. Then just delete the point you found.
How this is exactly made is kept as an exercise.
If you want to get a list of points in a given radius from another point, iterate the list and build a second list containing only the points within the specified range.
Again, how it's made in code is left to you as an exercise.
You can do this using a combination of the STL and Boost.Iterators and Boost.Bind -- I'm pasting the whole source of the solution to your problem here for your convenience:
#include <algorithm>
#include <list>
#include <cmath>
#include <boost/iterator/transform_iterator.hpp>
#include <boost/bind.hpp>
#include <cassert>
using namespace std;
using namespace boost;
struct Point {
int x, y;
Point() : x(0), y(0) {}
Point(int x1, int y1) : x(x1), y(y1) {}
Point(Point const & other) : x(other.x), y(other.y) {}
Point & operator=(Point rhs) { rhs.swap(*this); return *this; }
void swap(Point & other) { std::swap(other.x, x); std::swap(other.y, y); }
};
double point_distance(Point const & first, Point const & second) {
double x1 = first.x;
double x2 = second.x;
double y1 = first.y;
double y2 = second.y;
return sqrt( ((x2 - x1) * (x2 -x1)) + ((y2 - y1) * (y2 - y1)) );
}
int main(int argc, char * argv[]) {
list<Point> points;
points.push_back(Point(1, 1));
points.push_back(Point(2, 2));
points.push_back(Point(3, 3));
Point source(0, 0);
list<Point>::const_iterator closest =
min_element(
make_transform_iterator(
points.begin(),
bind(point_distance, source, _1)
),
make_transform_iterator(
points.end(),
bind(point_distance, source, _1)
)
).base();
assert(closest == points.begin());
return 0;
}
The meat of the solution is to transform each element in the list using the transform iterator using the point_distance function and then get the minimum distance from all the distances. You can do this while traversing the list, and in the end reach into the transform_iterator to get the base iterator (using the base() member function).
Now that you have that iterator, you can replace the assert(closest == points.begin()) with points.erase(closest).
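With C++11 or later the same idea can be written without Boost, by handing std::min_element a comparison lambda instead of wrapping the iterators. The sketch below is an illustration only: the names squared_distance and erase_closest are made up, and it compares squared distances so that no sqrt is needed.

```cpp
#include <algorithm>
#include <list>

struct Point {
    int x, y;
};

// Squared distance; enough for comparisons, so no sqrt is needed.
long long squared_distance(const Point& a, const Point& b) {
    const long long dx = static_cast<long long>(a.x) - b.x;
    const long long dy = static_cast<long long>(a.y) - b.y;
    return dx * dx + dy * dy;
}

// Erases the point nearest to `source`; returns false if the list is empty.
bool erase_closest(std::list<Point>& points, const Point& source) {
    const auto closest = std::min_element(
        points.begin(), points.end(),
        [&source](const Point& l, const Point& r) {
            return squared_distance(l, source) < squared_distance(r, source);
        });
    if (closest == points.end()) return false;  // empty list: nothing to erase
    points.erase(closest);
    return true;
}
```

Because min_element already returns a list iterator here, there is no base() step, and the empty-list case falls out of the end() check.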
I agree with the previous solution, and just wanted to add another thought. Although your Point class isn't very large and so a copy isn't really a problem, you might consider using Point* for your list. This way, when you create your second list, you would store the pointer to the same class. The down-side of this would be if you were deleting from multiple lists without a "master" that manages all created points, you could either create a memory leak if you didn't delete the underlying class or accidentally delete a class that was still being used in another list. Something to consider, though, depending on how your system evolves.
You have to keep the iterator to delete it afterwards.
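// assumes pointList is not empty; dist() and your_point stand for your own distance function and query point
std::list<Point>::iterator it = pointList.begin();
std::list<Point>::iterator closest = it; // the first point is the best candidate so far
double min_dist=dist(your_point, *it);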
++it;
for (; it != pointList.end(); ++it)
{
double actual_dist = dist(your_point, *it);
if (actual_dist < min_dist)
{
min_dist = actual_dist;
closest = it;
}
}
pointList.erase(closest);