I've written a kd-tree template, its parameter being a natural number K.
As part of the template, I've written the following function to compute the squared distance between two points (kd_point is an alias for std::array):
template <unsigned K>
float kd_tree<K>::DistanceSq(const kd_point &P, const kd_point &Q)
{
    float Sum = 0;

    for (unsigned i = 0; i < K; i++)
        Sum += (P[i] - Q[i]) * (P[i] - Q[i]);

    return Sum;
}
I've turned "Enable C++ Core Check (Release)" on, and it gives me said warning. Is there a right way to write this routine to eliminate the warning?
Since you mention in the comments that your kd_points support range-based iteration (so I assume they can return iterators), you can rewrite the function without the raw loop. Use named algorithms from the standard library instead:
#include <numeric>     // std::inner_product
#include <functional>  // std::plus

template <unsigned K>
float kd_tree<K>::DistanceSq(const kd_point &P, const kd_point &Q)
{
    return std::inner_product(
        begin(P), end(P), begin(Q), 0.0f, std::plus<float>{},
        [](float pi, float qi) {
            return (pi - qi)*(pi - qi);
        });
}
The standard library would be exempt from the warning, of course. If the (in this case) marginal benefit of replacing a raw loop by a named operation doesn't appeal to you, consider that if you ever come back to this code with a C++17 enabled compiler, you'll be able to almost effortlessly parallelize it:
#include <numeric>    // std::transform_reduce
#include <execution>  // std::execution::par

template <unsigned K>
float kd_tree<K>::DistanceSq(const kd_point &P, const kd_point &Q)
{
    return std::transform_reduce(std::execution::par, // Parallel execution enabled
        begin(P), end(P), begin(Q), 0.0f, std::plus<float>{},
        [](float pi, float qi) {
            return (pi - qi)*(pi - qi);
        });
}
The answer by StoryTeller is probably the most suitable C++ way to solve this particular task.
I would like to add that, in general, if you want to iterate over two sequences simultaneously rather than one, you can use the "secret overload" of boost::range::for_each that accepts two ranges:
#include <boost/range/algorithm_ext/for_each.hpp>

template <unsigned K>
float kd_tree<K>::DistanceSq(const kd_point &P, const kd_point &Q)
{
    float Sum = 0;

    boost::range::for_each(P, Q, [&Sum](float p, float q)
    {
        Sum += (p - q) * (p - q);
    });

    return Sum;
}
Note that, like the standard algorithms, this one is header-only and won't add any library dependency to your code.
Related
So I am trying to compute (M by N matrix) times (N by 1 vector) operations with threads into a resulting vector. The question in my book says that I should think about how many threads to use, and I assume since the result matrix should be M by 1 then I should use M threads, one for each set of operations.
M is height, and N is width.
To create the threads I use
thread* myThreads = new thread[height];
Then I launch the MatrixMultThreads function once per row. At the end I join all the threads.
for (int i = 0; i < height; i++)
{
    myThreads[i] = thread(MatrixMultThreads, my2DArray, vector, height, width);
}

for (int i = 0; i < height; i++)
{
    myThreads[i].join();
}
What I am having trouble figuring out is how I should sum up all the resulting values in the correct order, and how I would tell each specific thread what to do.
I was thinking maybe I should create a global variable step_i, set it to 0, and increment it each time the function is called. Then, since I can pass the width of the array, I could go through row step_i and add arr[i][j] * vector[j].
What I am having trouble figuring out is how I should sum up all the resulting values in the correct order.
They can be summed out-of-order, which is why this is a good problem to solve with multi-threading. If ordering matters to a specific problem, you can't improve it with multithreading (to be clear, if any sub-problem can be solved out-of-order then that sub-problem is a potential candidate for multithreading).
One solution to your problem is to set up a solution vector at the call site, then pass the corresponding element by reference (also the MatrixMultiply function needs to know which problem it's solving):
void MatrixMultiply(const Array2d& matrix,
                    const vector<int>& vec, int row, int& solution);

// ...

vector<int> result(height);

for (int i = 0; i < height; i++)
{
    // std::thread copies its arguments, so reference parameters must be
    // wrapped in std::ref / std::cref
    threads[i] = thread(MatrixMultiply, std::cref(array2d), std::cref(array1d),
                        i, std::ref(result[i]));
}
Your 2D array should really provide info on its height and width without having to pass these values explicitly.
BONUS INFO:
We could make this solution much more OOP in a way that you'll want to reuse for future problems (and some experienced programmers seem to miss this trick for using arrays):
The MatrixMultiply function is really similar to a dot-product function:
template <typename V1, typename V2>
auto DotProduct(const V1& vec1, const V2& vec2)
{
    auto result = vec1[0] * vec2[0];
    for (size_t i = 1; i < vec1.size(); ++i)
        result += vec1[i] * vec2[i];
    return result;
}

template <typename V1, typename V2, typename T>
auto DotProduct(const V1& vec1, const V2& vec2, T& result)
{
    result = DotProduct(vec1, vec2);
}
(The above allows the vectors to be any objects that use size() and the [] operator as expected.)
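For instance, here's a quick usage sketch (my example, not from the answer; the values are arbitrary) showing the same template accepting a std::vector and a std::array without any changes:

#include <array>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a {1, 2, 3};
    std::array<int, 3> b {4, 5, 6};
    std::cout << DotProduct(a, b) << '\n';   // prints 32 (= 1*4 + 2*5 + 3*6)
}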
We can write a wrapper class around std::vector that can be used by our array class to handle all the indexing for us; like this:
template <typename T, typename A>
class SubVector
{
    const typename std::vector<T,A>::iterator m_it;
    const size_t m_size, m_interval_size;

public:
    SubVector (std::vector<T,A>& v, size_t start, size_t sub_size, size_t i_size = 1)
        : m_it(v.begin() + start), m_size(sub_size), m_interval_size(i_size)
    {}

    auto size () const
    {
        return m_size;
    }

    const T& operator [] (size_t i) const
    {
        return m_it[i*m_interval_size];   // stride through the underlying vector
    }

    T& operator [] (size_t i)
    {
        return m_it[i*m_interval_size];
    }
};
Then you could use this in some kind of Vectorise method in your array; like this:
template <typename T, typename A = std::allocator<T>>
class Array2D
{
    std::vector<T,A> m_data;
    size_t m_width, m_height;

public:
    // your normal methods

    auto VectoriseRow(int r)        // not const: SubVector stores a non-const iterator into m_data
    {
        return SubVector(m_data, r*m_width, m_width);   // class template argument deduction (C++17)
    }

    auto VectoriseColumn(int c)     // a column view uses m_width as the stride
    {
        return SubVector(m_data, c, m_height, m_width);
    }
};
(Note: We could add the Vectorise feature to std::array or boost::multi_array by just writing a wrapper around them, which makes our array class more generic and saves us from having to do all the work. boost actually has this sort of feature inbuilt with array_view.)
Now our call site can be like so:
vector<int> result(height);
for (int i = 0; i < height; i++)
{
    // DotProduct is a template, so wrap the call in a lambda (std::thread can't
    // deduce which instantiation to use from the bare name) and write into result[i]
    threads[i] = thread([&, i]{ DotProduct(array2d.VectoriseRow(i), array1d, result[i]); });
}
This might seem like a more verbose way of solving the original problem (because it is), but if you use multi-dimensional arrays in your code you'll find you no longer have to write multi-array-specific functions or handle ugly indices for sub-problems (even in 1D problems, like Mean of Means). When dealing with those sorts of problems, you'll invariably want to reuse something like the above code.
You can store the result of each row dotted with the Nx1 vector in an Mx1 vector and then do the sums.
By the way, you would be much better off using OpenMP for such a problem: it would automate most of the thread management according to the number of cores on your machine, since here you might spawn a lot of threads:
https://www.openmp.org/
http://www.bowdoin.edu/~ltoma/teaching/cs3225-GIS/fall17/Lectures/openmp.html
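As a rough illustration (my sketch, not from the answer; the MatVec name and the vector-of-vectors layout are my assumptions, and it needs -fopenmp or your compiler's equivalent flag), the whole M-by-N times N-by-1 product collapses to a single parallel loop over rows:

#include <cstddef>
#include <vector>

std::vector<int> MatVec(const std::vector<std::vector<int>>& A,
                        const std::vector<int>& v)
{
    std::vector<int> result(A.size(), 0);

    #pragma omp parallel for          // one chunk of rows per core, joined automatically
    for (long i = 0; i < (long)A.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j)
            result[i] += A[i][j] * v[j];

    return result;
}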
Inside a function, I have
Eigen::SparseMatrix<double> & M;
if (M.IsRowMajor)
return my_func_template<Eigen::SparseMatrix<double,Eigen::RowMajor>&>(M,M.rows());
However, this does not compile, as the compiler does not believe M is an Eigen::SparseMatrix<double,Eigen::RowMajor>. How do I cast my reference as, specifically, Eigen::SparseMatrix<double,Eigen::RowMajor>, in the type-safe environment of C++11?
For example:
typedef Eigen::SparseMatrix<double> Smat;
typedef Eigen::SparseMatrix<double,Eigen::RowMajor> RMSmat;
typedef Eigen::SparseMatrix<double,Eigen::ColMajor> CMSmat;
enum direction { row, col};
template<class Mat>
vector<double> sum_along_inner(Mat &M){
  vector<double> sums(M.innerSize(),0);
  for(auto i = 0; i < M.outerSize(); i++){
    for(typename Mat::InnerIterator it(M,i); it; ++it){
      sums[i] += it.value();
    }
  }
  return sums;
}
vector<double> sum_along_axis(Smat &M, direction dir){
  // If I could solve this problem, I could also factor out these if components
  // and re-use them for other order-dependent functions I write,
  // so that my top-level functions are only about 2-4 lines long.
  if(dir == direction::row){
    if(M.IsRowMajor)
      return sum_along_inner<RMSmat>((my question) M);
    // else
    RMSmat Mrowmajor = M;
    return sum_along_inner<RMSmat>(Mrowmajor);
  }
  else {
    if(!M.IsRowMajor)
      return sum_along_inner<CMSmat>(M);
    // else
    CMSmat Mcolmajor = M;
    return sum_along_inner<CMSmat>((my_question) Mcolmajor);
  }
}
And if I do more than just sum_along_axis, the code complexity in terms of number of lines, readability, etc. is double what it needs to be, if only I could solve the problem I am asking about.
Otherwise, I can't abstract the loop, and I have to repeat it for column-major and row-major... because I can't just assume I won't call sum_along_axis from a function that hasn't already swapped the major order from the default Eigen::ColMajor to Eigen::RowMajor...
Further, if I am operating on MB-sized sparse matrices with dimensions too unwieldy to represent in dense form, I am going to notice a major slowdown (which defeats the purpose of using a sparse matrix to begin with) if I don't write composable functions that are order-agnostic and transition the major order only when needed.
So, unless I solve this, my line count and/or function count more or less starts to go combinatorial.
As I wrote in my first comment, M.IsRowMajor will always be false. This is because the storage-order template argument of Eigen::SparseMatrix defaults to Eigen::ColMajor, so the plain Eigen::SparseMatrix<double> you declared (your Smat typedef) is always a column-major type.
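To illustrate (a two-line check of my own, relying only on the question's Eigen::SparseMatrix<double> type), the storage order is a compile-time constant, not a runtime property:

static_assert(!Eigen::SparseMatrix<double>::IsRowMajor,
              "Smat is column-major, so M.IsRowMajor is always 0");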
If you want to write a function which accepts both row- and column-major matrices, you need to write something like
template<int mode>
vector<double> sum_along_axis(Eigen::SparseMatrix<double,mode> const &M, direction dir)
{
  if(dir == direction::row){
    return sum_along_inner<RMSmat>(M); // implicit conversion if necessary
  }
  else {
    return sum_along_inner<CMSmat>(M); // implicit conversion if necessary
  }
}
You need to rewrite sum_along_inner to accept a const reference to make the implicit conversion work:
template<class Mat>
vector<double> sum_along_inner(Mat const &M){
  vector<double> sums(M.outerSize(),0); // sums needs to have size M.outerSize()
  for(auto i = 0; i < M.outerSize(); i++){
    for(typename Mat::InnerIterator it(M,i); it; ++it){
      sums[i] += it.value();
    }
  }
  return sums;
}
If you want to avoid the conversion from row- to column-major (and vice versa) you should write a function which sums along the outer dimension and decide in your main function which function to call.
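A rough sketch of that idea (my code and my names sum_along_outer / rows_are_outer, not from the answer; it reuses the typedefs and the direction enum from the question): add a helper that sums along the outer dimension, then pick whichever helper matches the matrix's storage order, so no conversion ever happens:

template<class Mat>
vector<double> sum_along_outer(Mat const &M){
  vector<double> sums(M.innerSize(),0);
  for(auto i = 0; i < M.outerSize(); i++){
    for(typename Mat::InnerIterator it(M,i); it; ++it){
      sums[it.index()] += it.value();   // it.index() is the inner index
    }
  }
  return sums;
}

template<int mode>
vector<double> sum_along_axis(Eigen::SparseMatrix<double,mode> const &M, direction dir){
  constexpr bool rows_are_outer = Eigen::SparseMatrix<double,mode>::IsRowMajor;
  if((dir == direction::row) == rows_are_outer)
    return sum_along_inner(M);  // requested axis matches the storage order
  return sum_along_outer(M);    // otherwise accumulate across the outer dimension
}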
Mainly as an exercise I am implementing a conversion from base B to base 10:
#include <iostream>
#include <vector>

unsigned fromBaseB(std::vector<unsigned> x, unsigned b){
    unsigned out = 0;
    unsigned pow = 1;
    for (size_t i = 0; i < x.size(); i++){
        out += pow * x[i];   // x[0] is the least significant digit
        pow *= b;
    }
    return out;
}

int main() {
    auto z = std::vector<unsigned>(9,0);
    z[3] = 1;
    std::cout << fromBaseB(z,3) << std::endl;   // prints 27
}
Now I would like to write this using algorithms. E.g. using accumulate I could write
#include <numeric>   // std::accumulate

unsigned fromBaseB2(std::vector<unsigned> x, unsigned b){
    unsigned pow = 1;
    return std::accumulate(x.begin(), x.end(), 0u,
        [pow,b](unsigned sum, unsigned v) mutable {
            unsigned out = pow*v;
            pow *= b;
            return out+sum;
        });
}
However, imho that's not nicer code at all. Actually it would be more natural to write it as an inner product, because that's just what we have to calculate to make the base conversion. But to use inner_product I need an iterator:
template <typename T> struct pow_iterator{
    typedef T value_type;
    pow_iterator(T base) : base(base), value(1) {}
    T base, value;
    pow_iterator& operator++(){ value *= base; return *this; }
    T operator*() { return value; }
    bool operator==(const pow_iterator& other) const { return value == other.value; }
};

unsigned fromBaseB3(std::vector<unsigned> x, unsigned b){
    return std::inner_product(x.begin(), x.end(), pow_iterator<unsigned>(b), 0u);
}
Using that iterator, the call to the algorithm is now nice and clean, but I had to write a lot of boilerplate code for the iterator. Maybe it is just my misunderstanding of how algorithms and iterators are supposed to be used... Actually this is just an example of a general problem I face sometimes: I have a sequence of numbers that is calculated from a simple pattern, and I would like an iterator that, when dereferenced, returns the corresponding number from that sequence. When the sequence is stored in a container I simply use the iterators provided by the container, but I would like to do the same when there is no container holding the values. I could of course try to write my own generic iterator that does the job, but isn't there something existing in the standard library that can help here?
To me it feels a bit strange that I can use a lambda to cheat accumulate into calculating an inner product, but to use inner_product directly I have to do something extra (either precalculate the powers and store them in a container, or write an iterator, i.e. a separate class).
tl;dr: Is there an easy way to reduce the boilerplate for the pow_iterator above?
The more general (but maybe too broad) question: is it "ok" to use an iterator for a sequence of values that is not stored in a container, but is calculated only when the iterator is dereferenced? Is there a "C++ way" of implementing it?
As Richard Hodges wrote in the comments, you can look at Boost.Iterator. Alternatively, there is range-v3. If you go with boost, there are a few possible ways to go. The following shows how to do it with boost::counting_iterator and boost::transform_iterator (C++11):
#include <iostream>
#include <cmath>
#include <boost/iterator/counting_iterator.hpp>
#include <boost/iterator/transform_iterator.hpp>
int main() {
    const std::size_t base = 2;
    auto make_it = [](std::size_t i) {
        return boost::make_transform_iterator(
            boost::make_counting_iterator(i),
            [](std::size_t j){ return std::pow(base, j); });
    };
    for(auto b = make_it(0); b != make_it(10); ++b)
        std::cout << *b << std::endl;
}
Here's the output:
$ ./a.out
1
2
4
8
16
32
64
128
256
512
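To connect this back to the original base-conversion function: a hedged sketch of my own (fromBaseB4 and pow_at are my names) that feeds such a generated power sequence straight into std::inner_product. It recomputes an integer power per index, which is wasteful but keeps the sketch short and avoids std::pow's floating-point detour:

#include <cstddef>
#include <numeric>
#include <vector>
#include <boost/iterator/counting_iterator.hpp>
#include <boost/iterator/transform_iterator.hpp>

unsigned fromBaseB4(const std::vector<unsigned>& x, unsigned b) {
    auto pow_at = [b](std::size_t j) {
        unsigned p = 1;
        for (std::size_t k = 0; k < j; ++k) p *= b;   // b^j computed in integers
        return p;
    };
    auto powers = boost::make_transform_iterator(
        boost::make_counting_iterator(std::size_t{0}), pow_at);
    return std::inner_product(x.begin(), x.end(), powers, 0u);
}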
I have two arrays comprising x,y vales for y=f(x). I would like to provide a function that finds the value of x that corresponds to either the min or max sampled value of y.
What is an efficient way to select proper comparison operator before looping over the values in the arrays?
For example, I would like to do something like the following:
double FindExtremum(const double* x, const double* y,
                    const unsigned int n, const bool isMin) {
    static std::less<double> lt;
    static std::greater<double> gt;
    std::binary_function<double,double,bool>& IsBeyond = isMin ? lt : gt;
    double xm(*x), ym(*y);
    for (unsigned int i=0; i<n; ++i, ++x, ++y) {
        if (IsBeyond(*y,ym)) {
            ym = *y;
            xm = *x;
        }
    }
    return xm;
}
Unfortunately, the base class std::binary_function does not define a virtual operator().
Will a compiler like g++ 4.8 be able to optimize the most straightforward implementation?
double FindExtremum(const double* x, const double* y,
                    const unsigned int n, const bool isMin) {
    double xm(*x), ym(*y);
    for (unsigned int i=0; i<n; ++i, ++x, ++y) {
        if ( ( isMin && (*y<ym)) ||
             (!isMin && (*y>ym)) ) {
            ym = *y;
            xm = *x;
        }
    }
    return xm;
}
Is there another way to arrange things to make it easy for the compiler to optimize? Is there a well known algorithm for doing this?
I would prefer to avoid using a templated function, if possible.
You would need to pass the comparison functor as a templated function parameter, e.g.
template <typename Compare>
double FindExtremum(const double* x, const double* y,
                    const unsigned int n, Compare compare) {
    double xm(*x), ym(*y);
    for (unsigned int i=0; i<n; ++i, ++x, ++y) {
        if (compare(*y,ym)) {
            ym = *y;
            xm = *x;
        }
    }
    return xm;
}
Then if you need runtime choice, write something like this:
if (isMin) {
    FindExtremum(x, y, n, std::less<double>());
} else {
    FindExtremum(x, y, n, std::greater<double>());
}
Avoiding a templated function is not really possible in this case. The best performing code will be one that embeds the comparison operation directly in the loop, avoiding a function call - you can either write a template or write two copies of this function. A templated function is clearly the better solution.
For ultimate efficiency, make the comparison operator or the comparison operator choice a template parameter, and don't forget to measure.
When striving for utmost micro-efficiency, virtual calls work against that goal.
That said, this is most likely a case of premature optimization, which Donald Knuth described thusly:
“Premature optimization is the root of all evil”
(I omitted his reservations, it sounds more forceful that way. :-) )
Instead of engaging in a micro-optimization frenzy, which gains you little if anything and wastes your time, I recommend more productively trying to make the code as clear and provably correct as possible. For example, use std::vector instead of raw arrays and separately passed sizes. And, for example, don't call the boolean comparison operator compare, as recommended in another answer, since that's the conventional name for a tri-valued compare (e.g. as in std::string::compare).
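For instance, a minimal sketch of that "clear and provably correct" direction (my example, not code from the answer; the std::vector signature is my variation on the question's function):

#include <algorithm>
#include <vector>

double FindExtremum(const std::vector<double>& x,
                    const std::vector<double>& y, bool isMin)
{
    // pick the iterator to the extremal y sample, then return the matching x
    auto it = isMin ? std::min_element(y.begin(), y.end())
                    : std::max_element(y.begin(), y.end());
    return x[it - y.begin()];
}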
Some questions arise here. First, I think you're overcomplicating the situation. For example, it would be easier to have two functions, one that calculates the min and another that calculates the max, and then call either of them depending on the value of isMin.
What's more, note how in each iteration you're testing whether isMin is true or not (at least in the "optimized" code you show last), when that comparison could have been done just once.
Now, if isMin can be deduced in any way at compile time, you can use a template class that selects the correct implementation optimized for the case, and without any run-time overhead (not tested, written from memory):
template<bool isMin>
class ExtremeFinder
{
public:
    static float FindExtreme(const double* x, const double* y,
                             const unsigned int n)
    {
        // Version that calculates when isMin is false
    }
};

template<>
class ExtremeFinder<true>
{
public:
    static float FindExtreme(const double* x, const double* y,
                             const unsigned int n)
    {
        // Version that calculates when isMin is true
    }
};
and call it as ExtremeFinder<test_to_know_isMin>::FindExtreme(...);, or, if you cannot decide it at compile time, you can always do:
if (isMin_should_be_true)
    ExtremeFinder<true>::FindExtreme(...);
else
    ExtremeFinder<false>::FindExtreme(...);
If you had 2 complementary criteria, e.g. < and >=, you could use a bool less function argument and use XOR in the loop:
if (less ^ (a>=b))
Don't know about performance, but it is easy to write.
Or, with < and >, which are disjoint but don't cover all possibilities (both exclude equality):
if ((a != b) && (less ^ (a > b)))
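Applied to the original function, a hedged sketch of the idea (my illustration; note the XOR is still evaluated on every iteration, and the max case updates on ties because it uses >=):

double FindExtremum(const double* x, const double* y,
                    const unsigned int n, const bool isMin)
{
    double xm = x[0], ym = y[0];
    for (unsigned int i = 1; i < n; ++i) {
        // isMin == true  -> update when y[i] <  ym
        // isMin == false -> update when y[i] >= ym
        if (isMin ^ (y[i] >= ym)) {
            ym = y[i];
            xm = x[i];
        }
    }
    return xm;
}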
I'm wondering how to properly use pointers in for and while loops in C++. Usually I write using C instead of C++. The only reason I'm using the C++ std library this time is so I can use the complex number functions required by other mathematical functions in the code.
As part of the assignment we were given the following function declaration. The part that I wrote is commented within the function.
typedef std::complex<double> complex;

// Evaluates a polynomial using Horner's approach.
// Inputs:
//   [coeffs, coeffs_end) - polynomial coefficients, ordered by descending power
//   x - point of evaluation
// Outputs:
//   p - value of polynomial at x
//   dp - value of polynomial derivative at x
//   ddp - value of polynomials second derivative at x
//
template<typename T>
inline void poly_val(T const* coeffs, T const* coeffs_end, T x, T & p, T & dp, T & ddp)
{
    //MY CODE HERE
    int i = 0;
    const T *pnt = coeffs;
    while(pnt != coeffs_end){
        //Evaluate coefficients for descending powers
        p += coeffs(i)*pow(x,((coeffs_end-1)-i));
        pnt++;
        i++;
    }
}
The function doesn't know the length of the array, so I'm guessing the stop condition is the pointer 'coeffs_end', which points to the last value in the array 'coeffs'. Can I use a pointer in a conditional this way? (Traditionally I would have fed the length of the array into the function, but we can't modify the declarations.)
If I do it this way I keep getting an error when compiling (which I don't understand):
C2064: term does not evaluate to a function taking 1 arguments
for the following line:
p += coeffs(i)*pow(x,((coeffs_end-1)-i));
coeffs(i) is the syntax for calling a function that takes an integer argument, but in your case coeffs is a pointer, so you need to use the [] operator to access the element at its index.
Also, ((coeffs_end-1)-i) is a pointer, not the exponent you want. The power of x should be an element count, i.e. (coeffs_end - coeffs) - 1 - i, not an address.
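Putting both fixes together, a minimal sketch of the corrected loop (my version, keeping the original structure and only filling in p; it also zeroes p first, since p is an output parameter):

p = T(0);                               // output accumulator starts from zero
int i = 0;
const T *pnt = coeffs;
const auto n = coeffs_end - coeffs;     // number of coefficients
while (pnt != coeffs_end) {
    // index with [], and use an element count (not a pointer) as the power
    p += coeffs[i] * pow(x, n - 1 - i);
    pnt++;
    i++;
}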
Maybe it'd be more readable to write this in a cleaner fashion:
#include <cmath>
#include <iterator>

template<typename T>
inline void poly_val(T const* coeffs, T const* coeffs_end, T x, T & p, T & dp, T & ddp)
{
    const std::size_t nterms = std::distance(coeffs, coeffs_end);
    for (std::size_t i = 0; i != nterms; ++i)
    {
        p += coeffs[i] * std::pow(x, nterms - 1 - i);
    }
}
Since raw pointers can be treated as iterators, we can use std::distance to determine the size of an array bounded by a range [first, last).
Edit: Actually it can be done even more simply:
for (const T * it = coeffs; it != coeffs_end; ++it)
{
    p += *it * std::pow(x, std::distance(it, coeffs_end) - 1);
}