My medianfilter.cpp class invokes qsort as seen below.
vector<float> medianfilter::computeMedian(vector<float> v) {
float arr[100];
std::copy(v.begin(), v.end(), arr);
unsigned int i;
qsort(arr, v.size(), sizeof(float), compare);
for (i = 0; i < v.size(); i++) {
printf("%f ", arr[i]);
}
printf("median=%d ", arr[v.size() / 2]);
return v;
}
The implementation of my comparison is:
int medianfilter::compare(const void * a, const void * b) {
float fa = *(const float*) a;
float fb = *(const float*) b;
return (fa > fb) - (fa < fb);
}
while the declaration in mediafilter.hpp is private and looks like this:
int compare (const void*, const void*);
A compilation error occurs: cannot convert ‘mediafilter::compare’ from type ‘int (mediafilter::)(const void*, const void*)’ to type ‘__compar_fn_t {aka int (*)(const void*, const void*)}’
I don't completely understand this error. How do I correctly declare and implement this comparison method?
Thanks!
Compare is a non-static member function whereas qsort expects a non-member function (or a static member function). As your compare function doesn't seem to use any non-static members of the class, you could just declare it static. In fact I'm not sure what your median filter class does at all. Perhaps you just need a namespace.
Why not sort the vector directly instead of copying it into a second array? Furthermore your code will break if the vector has more than 100 elements.
The default behavior of sort does just what you need, but for completeness I show how to use a compare function.
I also changed the return type of your function because I don't understand why a function called computeMedian wouldn't return the median.
namespace medianfilter
{
bool compare(float fa, float fb)
{
return fa < fb;
}
float computeMedian(vector<float> v)
{
std::sort(v.begin(), v.end(), compare);
// or simply: std::sort(v.begin(), v.end());
for (size_t i = 0; i < v.size(); i++) {
printf("%f ", v[i]);
}
if (v.empty())
{
// what do you want to happen here?
}
else
{
float median = v[v.size() / 2]; // what should happen if size is even?
printf("median=%f ", median); // it was %d before
return median;
}
}
}
You can't call compare as it is because it is a member function and requires a this pointer (i.e. it needs to be called on an object). However, as your compare function doesn't need a this pointer, simply make it a static function and your code will compile.
Declare it like this in your class:
static int compare(const void * a, const void * b);
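To make the static-member fix concrete, here is a minimal sketch. The class is a cut-down, hypothetical version of the one in the question (the `sorted` method is my own name, not from the original code); only the `static` keyword on `compare` matters.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical, cut-down version of the class from the question.
class medianfilter {
public:
    // Sorts a copy of v with qsort and returns the sorted copy.
    std::vector<float> sorted(std::vector<float> v) {
        if (!v.empty()) {
            std::qsort(v.data(), v.size(), sizeof(float), compare);
        }
        return v;
    }

private:
    // static: there is no hidden `this` parameter, so the function name
    // converts to int (*)(const void*, const void*), which qsort expects.
    static int compare(const void* a, const void* b) {
        const float fa = *static_cast<const float*>(a);
        const float fb = *static_cast<const float*>(b);
        return (fa > fb) - (fa < fb);
    }
};
```

Calling `medianfilter().sorted({3.0f, 1.0f, 2.0f})` should give back the values in ascending order.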
Not directly related to your question (for which you already have the answer) but some observations:
Your calculation of median is wrong. If the number of elements is even you should return the average of the two center values not the value of lower one.
The copy to the array with a set size screams buffer overflow. Copy to another vector and std::sort it or (as suggested by @NeilKirk) just sort the original one unless you have cause not to modify it.
There is no guard against empty input. The median is undefined in this case, but your implementation would just return whatever happens to be in arr[0].
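Putting the three observations together, a median function might look like the sketch below. Throwing on empty input is just one possible policy that I picked for illustration; the original code does not say what should happen there.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// v is taken by value, so the caller's vector is left unmodified.
float computeMedian(std::vector<float> v) {
    if (v.empty()) {
        // Assumption: treat an empty input as an error.
        throw std::invalid_argument("median of an empty vector is undefined");
    }
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    if (v.size() % 2 == 1) {
        return v[mid];                    // odd size: the single middle value
    }
    return (v[mid - 1] + v[mid]) / 2.0f;  // even size: average of the two middle values
}
```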
Ok, this is more of an appendix to Eli Algranti's (excellent) answer than an answer to the original question.
Here is generic code to compute the quantile quant of a vector of doubles called x (which the code below preserves).
First things first: there are many definitions of quantiles (R alone lists 9). The code below corresponds to definition #5 (which is also the default quantile function in MATLAB and generally the one statisticians think of when they think of quantiles).
The key idea here is that when the quantile does not fall on a precise observation (e.g. when you want the 15% quantile of an array of length 10), the implementation below performs the (correct) interpolation between adjacent quantiles (in this case between the 10% and 20% ones). This is important so that when you increase the number of observations (I'm hinting at the name medianfilter here) the value of the quantile does not jump about abruptly but converges smoothly instead (which is one reason why this is the statistician's preferred definition).
The code assumes that x has at least one element (the code below is part of a longer one and I feel this point has been made already).
Unfortunately it's written using many functions from the (excellent!) C++ Eigen library, and it is too late at night for me to translate the Eigen functions or sanitize the variable names, but the key ideas should be readable.
#include <algorithm> // std::nth_element
#include <cmath>     // floor, ceil
#include <Eigen/Dense>
#include <Eigen/QR>
using namespace std;
using namespace Eigen;
using Eigen::MatrixXd;
using Eigen::VectorXd;
using Eigen::VectorXi;
double quantiles(const Ref<const VectorXd>& x,const double quant){
//computes the quantile 'quant' of x.
const int n=x.size();
double lq,uq,fq;
const double q1=n*(double)quant+0.5;
const int index1=floor(q1);
const int index2=ceil(q1);
const double index3=(double)index2-q1;
VectorXd x1=x;
std::nth_element(x1.data(),x1.data()+index1-1,x1.data()+x1.size());
lq=x1(index1-1);
if(index1==index2){
fq=lq;
} else {
uq=x1.segment(index1,x1.size()-index1).minCoeff(); // smallest of all elements after position index1-1
fq=lq*index3+uq*(1.0-index3);
}
return(fq);
}
So the code uses one call to nth_element, which has average complexity O(n) [sorry for sloppily using big O for the average case] and (when n is even) one extra call to min() [which in Eigen dialect is written .minCoeff()] on at most n/2 elements of the vector, which is O(n/2).
This is much better than using partial sort (which would cost O(n log(n/2)), worst case) or sort (which would cost O(n log n)).
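For readers without Eigen, the same selection idea can be sketched with the standard library alone. This is only the median (quant = 0.5) special case, not the general interpolated quantile, and it takes the maximum of the lower half rather than the minimum of the upper half; the function name is my own.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median via selection instead of a full sort. Assumes x is non-empty.
// x is taken by value so the caller's data is preserved.
double medianBySelection(std::vector<double> x) {
    const std::size_t n = x.size();
    const std::size_t mid = n / 2;
    // After this call x[mid] is the element a full sort would put there,
    // and every element before it compares less than or equal to it.
    std::nth_element(x.begin(), x.begin() + mid, x.end());
    const double upper = x[mid];
    if (n % 2 == 1) {
        return upper;
    }
    // Even n: the lower middle value is the largest element of the first half.
    const double lower = *std::max_element(x.begin(), x.begin() + mid);
    return 0.5 * (lower + upper);
}
```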
I have a vector of 3D coordinates:
vector<float64>contourdata;
contourdata={x0,y0,z0,x1,y1,z1,x2,y2,z2,...}
And I want to sort them by the vector of the z value.
How can I do it in c++?
Like this:
#include <algorithm>
#include <iostream>
#include <vector>
#include <format>
// 3d points have 3 coordinates and
// we need to move those 3 values together when sorting
// It is also good to use "concepts" from real world as
// names in code : so define a struct representing a 3d coordinate.
// (Or use a 3d coordinate type from an existing library)
struct vec_3d_t
{
double x;
double y;
double z;
};
// helper function for outputting the values of one 3d point
// not essential for your problem.
std::ostream& operator<<(std::ostream& os, const vec_3d_t& data_point)
{
os << std::format("({0},{1},{2})", data_point.x, data_point.y, data_point.z);
return os;
}
int main()
{
// std::vector is a (resizable) array
// in this case to hold 3d coordinates
// then initialize the data with some values
// (you will probably get them from somewhere else, e.g. a file)
std::vector<vec_3d_t> contour_data
{
{3.0,4.0,5.0}, // first coordinate
{1.0,2.0,3.0}, // second etc ...
{7.0,8.0,9.0}
};
// this calls the sort algorithm
// using a function to compare two 3d points
// to sort on z only compare z.
std::sort(contour_data.begin(), contour_data.end(), [](const vec_3d_t& lhs, const vec_3d_t& rhs)
{
return lhs.z < rhs.z;
});
// range based for loop over data points
for (const auto& data_point : contour_data)
{
std::cout << data_point << "\n";
}
return 0;
}
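If the data really has to stay in the flat {x0,y0,z0,x1,y1,z1,...} layout from the question, one option is to pack it into structs, sort, and unpack again. The sketch below assumes the flat vector holds doubles and that any incomplete trailing triple can be dropped; `Point3` and `sortFlatByZ` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Point3 {
    double x;
    double y;
    double z;
};

// Returns a new flat vector whose points are ordered by ascending z.
std::vector<double> sortFlatByZ(const std::vector<double>& flat) {
    std::vector<Point3> points;
    points.reserve(flat.size() / 3);
    for (std::size_t i = 0; i + 2 < flat.size(); i += 3) {
        points.push_back({flat[i], flat[i + 1], flat[i + 2]});
    }
    std::sort(points.begin(), points.end(),
              [](const Point3& lhs, const Point3& rhs) { return lhs.z < rhs.z; });
    std::vector<double> sorted;
    sorted.reserve(points.size() * 3);
    for (const Point3& p : points) {
        sorted.push_back(p.x);
        sorted.push_back(p.y);
        sorted.push_back(p.z);
    }
    return sorted;
}
```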
I struggled to find a way to use std::sort, but you can always fall back to the C qsort stdlib function:
#include <cstdlib>
int compare(const double *A, const double *B) {
return (A[2] <= B[2]) ? -1 : +1;
}
...
const size_t N = contourdata.size() / 3;
qsort(contourdata.data(), N, 3*sizeof(double),
(int (*)(const void *, const void*))compare);
qsort is a sort routine provided by the C standard library that has been around a long time (BSD-derived systems also offer mergesort and heapsort with the same interface). It is designed to sort data stored in contiguous arrays, but it is the programmer's job to specify how the data is laid out and how to order the elements. qsort is not type-safe and generally not preferred, but can handle cases like this one. qsort has four parameters:
A ptr to the base of the array. Note that the ptr type is void * and thus the compiler has no clue about the type of data in the buffer. std::vector provides a data() method that provides a ptr to its internal buffer (which is guaranteed to be contiguous).
The number of elements. Since each element consists of 3 doubles, we use the size of the vector divided by 3.
The size of each element in bytes.
A ptr to a function used for comparing two elements. The arguments to the function are generic ptrs, but since we know they are ptrs to buffers containing doubles we can specify the type in our compare function. Each element is an array of 3 doubles and, since we are using the z-component as the sort key, we compare the doubles at offset 2. We return -1 for "less than" and +1 for "greater"; this is enough to know how to sort.
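Note that when passing the compare function to qsort we cast it to the function ptr type that it expects, to keep the compiler from complaining. Strictly speaking, calling a function through a pointer of a different function type is undefined behavior, even though it works on common platforms; declaring compare with const void * parameters avoids the cast.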
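The cast can be avoided by giving the comparator exactly the signature qsort expects and converting the pointers inside it. A sketch under the same layout assumption (three consecutive doubles per point); `compareByZ` and `sortByZ` are hypothetical names. It also returns 0 for equal keys, which qsort's contract asks for.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Each qsort element is one point: three consecutive doubles (x, y, z).
static int compareByZ(const void* a, const void* b) {
    const double za = static_cast<const double*>(a)[2];
    const double zb = static_cast<const double*>(b)[2];
    return (za > zb) - (za < zb);  // -1, 0 or +1
}

// Sorts the flat coordinate buffer in place, moving whole points.
void sortByZ(std::vector<double>& contourdata) {
    const std::size_t n = contourdata.size() / 3;
    if (n > 1) {
        std::qsort(contourdata.data(), n, 3 * sizeof(double), compareByZ);
    }
}
```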
Inside a function, I have:
Eigen::SparseMatrix<double> & M;
if (M.IsRowMajor)
return my_func_template<Eigen::SparseMatrix<double,Eigen::RowMajor>&>(M,M.rows());
However, this does not compile, as the compiler does not believe M is an Eigen::SparseMatrix<double,Eigen::RowMajor>. How do I cast my reference as, specifically, Eigen::SparseMatrix<double,Eigen::RowMajor>, in the type-safe environment of C++11?
For example:
typedef Eigen::SparseMatrix<double> Smat;
typedef Eigen::SparseMatrix<double,Eigen::RowMajor> RMSmat;
typedef Eigen::SparseMatrix<double,Eigen::ColMajor> CMSmat;
enum direction { row, col};
template<class Mat>
vector<double> sum_along_inner(Mat &M){
vector<double> sums(M.innerSize(),0);
for(auto i = 0; i < M.outerSize(); i++){
for(typename M::InnerIterator it(M,i); it;++it){
sums[i] += it.value();
}
}
}
vector<double> sum_along_axis(Smat &M, direction dir){
// If I could solve this problem,
//
// I could also function off these if components,
// and re-use them for other order-dependent functions I write
// so that my top level functions are only about 2-4 lines long
if(dir == direction::row){
if(M.IsRowMajor)
return sum_along_inner<RMSmat>((my question) M);
//else
RMSmat Mrowmajor = M;
return sum_along_inner<RMSmat>(Mrowmajor);
}
else {
if(!M.IsRowMajor)
return sum_along_inner<CMSmat>(M);
// else
CMSmat Mcolmajor = M;
return sum_along_inner<CMSmat>((my_question) Mcolmajor);
}
}
And if I do more than just sum_along_axis, then the code complexity in terms of number of lines, readability, etc. is double what it needs to be if only I could solve this problem that I am asking about.
Otherwise, I can't abstract the loop, and I have to repeat it for column major and row major...because I can't just assume I won't call sum_along_axis from a function that hasn't already swapped the major-order from the default Eigen::ColMajor to Eigen::RowMajor...
Further, if I am operating on MB-sized sparse matrices with dimensions too unwieldy to represent in dense matrix form, I am going to notice a major slowdown (which defeats the purpose of using a sparse matrix to begin with) if I don't write composable functions which are order agnostic, and transition the major-order only when needed.
So, unless I solve for this, my line count and/or function count, more or less, starts to go combinatorial.
As I wrote in my first comment, M.IsRowMajor will always be false. This is because Eigen::SparseMatrix always has a second template argument (the storage order), which defaults to Eigen::ColMajor.
If you want to write a function which accepts both row- and column-major matrices, you need to write something like
template<int mode>
vector<double> sum_along_axis(Eigen::SparseMatrix<double,mode> const &M, direction dir)
{
    if(dir == direction::row){
        return sum_along_inner<RMSmat>(M); // implicit conversion if necessary
    }
    else {
        return sum_along_inner<CMSmat>(M); // implicit conversion if necessary
    }
}
You need to rewrite sum_along_inner to accept a const reference to make the implicit conversion work:
template<class Mat>
vector<double> sum_along_inner(Mat const &M){
    vector<double> sums(M.outerSize(),0); // sums needs to have size M.outerSize()
    for(auto i = 0; i < M.outerSize(); i++){
        // InnerIterator is a member of the type Mat, not of the object M
        for(typename Mat::InnerIterator it(M,i); it;++it){
            sums[i] += it.value();
        }
    }
    return sums;
}
If you want to avoid the conversion from row- to column-major (and vice versa) you should write a function which sums along the outer dimension and decide in your main function which function to call.
I have two arrays comprising x,y values for y=f(x). I would like to provide a function that finds the value of x that corresponds to either the min or max sampled value of y.
What is an efficient way to select proper comparison operator before looping over the values in the arrays?
For example, I would like to do something like the following:
double FindExtremum(const double* x, const double* y,
const unsigned int n, const bool isMin) {
static std::less<double> lt;
static std::greater<double> gt;
std::binary_function<double,double,bool>& IsBeyond = isMin ? lt : gt;
double xm(*x), ym(*y);
for (unsigned int i=0; i<n; ++i, ++x, ++y) {
if (IsBeyond()(*y,ym)) {
ym = *y;
xm = *x;
}
}
}
Unfortunately, the base class std::binary_function does not define a virtual operator().
Will a compiler like g++ 4.8 be able to optimize the most straightforward implementation?
double FindExtremum(const double* x, const double* y,
const unsigned int n, const bool isMin) {
double xm(*x), ym(*y);
for (unsigned int i=0; i<n; ++i, ++x, ++y) {
if ( ( isMin && (*y<ym)) ||
(!isMin && (*y>ym)) ) {
ym = *y;
xm = *x;
}
}
}
Is there another way to arrange things to make it easy for the compiler to optimize? Is there a well known algorithm for doing this?
I would prefer to avoid using a templated function, if possible.
You would need to pass the comparison functor as a templated function parameter, e.g.
template <typename Compare>
double FindExtremum(const double* x, const double* y,
                    const unsigned int n, Compare compare) {
    double xm(*x), ym(*y);
    for (unsigned int i=0; i<n; ++i, ++x, ++y) {
        if (compare(*y,ym)) {
            ym = *y;
            xm = *x;
        }
    }
    return xm; // x at the extremal y
}
Then if you need runtime choice, write something like this:
if (isMin) {
FindExtremum(x, y, n, std::less<double>());
} else {
FindExtremum(x, y, n, std::greater<double>());
}
Avoiding a templated function is not really possible in this case. The best performing code will be one that embeds the comparison operation directly in the loop, avoiding a function call - you can either write a template or write two copies of this function. A templated function is clearly the better solution.
For ultimate efficiency, make the comparison operator or the comparison operator choice a template parameter, and don't forget to measure.
When striving for utmost micro-efficiency, doing virtual calls is not in the direction of the goal.
That said, this is most likely a case of premature optimization, which Donald Knuth described thusly:
“Premature optimization is the root of all evil”
(I omitted his reservations, it sounds more forceful that way. :-) )
Instead of engaging in micro-optimization frenzy, which gains you little if anything, and wastes your time, I recommend more productively trying to make the code as clear and provably correct as possible. For example, use std::vector instead of raw arrays and separately passed sizes. And, for example, don't call the boolean comparison operator compare, as recommended in another answer, since that's the conventional name for tri-valued compare (e.g. as in std::string::compare).
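As a sketch of the "clear first" approach, the standard algorithms std::min_element and std::max_element already do the search. The function below assumes x and y are vectors of the same non-zero length; the name `findExtremum` is my own.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Returns the x value paired with the smallest (isMin) or largest y.
// Assumes x.size() == y.size() and that both are non-empty.
double findExtremum(const std::vector<double>& x,
                    const std::vector<double>& y,
                    bool isMin) {
    const auto it = isMin ? std::min_element(y.begin(), y.end())
                          : std::max_element(y.begin(), y.end());
    const auto index = static_cast<std::size_t>(std::distance(y.begin(), it));
    return x[index];
}
```

The isMin test happens once, outside the loop, and each standard algorithm is free to use the plain < comparison internally.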
Some questions arise here. First, I think you're overcomplicating the situation. For example, it would be easier to have two functions, one that calculates the min and another that calculates the max, and then call either of them depending on the value of isMin.
More so, note how in each iteration you're making the test to see if isMin is true or not, (at least in the "optimized" code you show last) and that comparison could have been done just once.
Now, if isMin can be deduced in any way at compile time, you can use a template class that selects the correct implementation optimized for the case, and without any run-time overhead (not tested, written from memory):
template<bool isMin>
struct ExtremeFinder // struct, so that FindExtreme is publicly callable
{
    static double FindExtreme(const double* x, const double* y,
                              const unsigned int n)
    {
        // Version that calculates when isMin is false
    }
};
template<>
struct ExtremeFinder<true>
{
    static double FindExtreme(const double* x, const double* y,
                              const unsigned int n)
    {
        // Version that calculates when isMin is true
    }
};
and call it as ExtremeFinder<test_to_know_isMin>::FindExtreme(...);, or, if you cannot decide it at compile time, you can always do:
if (isMin_should_be_true)
ExtremeFinder<true>::FindExtreme(...);
else
ExtremeFinder<false>::FindExtreme(...);
If you had 2 disjunct criteria, e.g. < and >=, you could use a bool less function argument and use XOR in loop:
if (less ^ (a>=b))
I don't know about performance, but it is easy to write.
Or, for criteria that do not cover all possibilities, such as < and >:
if ( (a!=b) && (less ^ (a>b)) )
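For illustration, here is how the strict variant of the XOR test could sit inside the search loop. This is only a sketch with a hypothetical function name, assuming n is at least 1.

```cpp
// Returns the x paired with the smallest y (less == true) or the
// largest y (less == false). Assumes n >= 1.
double findExtremumXor(const double* x, const double* y,
                       unsigned int n, bool less) {
    double xm = x[0];
    double ym = y[0];
    for (unsigned int i = 1; i < n; ++i) {
        const double a = y[i];
        // less == true  -> true only when a <  ym
        // less == false -> true only when a >  ym
        if ((a != ym) && (less ^ (a > ym))) {
            ym = a;
            xm = x[i];
        }
    }
    return xm;
}
```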
Here is my problem:
I have a struct:
struct point
{
int x;
int y;
};
and then I have an array:
for (int i = 0;i < n;i++)
{
arr[i].x=rand() % n + 1;
}
I defined the quicksort function as follows:
void quicksort(int *a, int left, int right);
and I want to sort the point by X coordinate, so I call the quicksort:
quicksort(arr.x, 0, n-1);
And this is the error message:
error: request for member 'x' in 'arr', which is of non-class type 'point [(((unsigned int)(((int)n) + -0x000000001)) + 1)]'
Sorry if the question is too stupid or badly formulated, the truth is I'm a newbie and I'm really willing to learn as much as possible and I'd be very thankful for your help!
If you always want to sort by x, then you can hard-code it into the sort function, and just pass a pointer to the array to sort:
void quicksort(point * arr, int left, int right) {
// test points with
// if (arr[i].x < arr[j].x) {/* i sorts before j */}
}
quicksort(arr, 0, n-1);
To specify a class member to sort by, you need a pointer-to-member, not a pointer; something like:
void quicksort(point * arr, int point::*member, int left, int right){
// test points with
// if (arr[i].*member < arr[j].*member) {/* i sorts before j */}
}
quicksort(arr, &point::x, 0, n-1);
More generically, you could follow the example of std::sort and accept any comparison functor:
template <typename RandIter, typename Compare>
void quicksort(RandIter begin, RandIter end, Compare compare) {
// test points with
// if (compare(*it1, *it2)) {/* *it1 sorts before *it2 */}
}
quicksort(arr, arr+n,
[](point const &lhs, point const &rhs) {return lhs.x < rhs.x;});
And of course, unless you're learning how to implement a sorting algorithm, just use std::sort.
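For reference, the std::sort version is only a few lines. The sketch assumes the points live in a std::vector rather than the raw array from the question; `sortByX` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <vector>

struct point {
    int x;
    int y;
};

// Sorts the points in place by ascending x; each y travels with its x.
void sortByX(std::vector<point>& pts) {
    std::sort(pts.begin(), pts.end(),
              [](const point& lhs, const point& rhs) { return lhs.x < rhs.x; });
}
```

For a raw array the call would be std::sort(arr, arr + n, ...) with the same lambda.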
quicksort(arr,0,n-1);
Then, within quicksort, compare the arr[i].x values.
There are a few problems with your code.
1. quicksort accepts int* but you try to pass int value x
2. You try to pass int but you actually call an undefined variable arr.x
You could call it in the form &arr[i].x, but to accomplish what you want, you probably want to pass a pointer to the entire struct.
You need to pass arr as the parameter, as that is the array to be sorted. arr.x is meaningless. You are not passing the string "arr.x" as a parameter which can somehow be interpreted as meaning sort on the x field - when the compiler sees this, it is looking for an x element of arr, which doesn't exist, as the error message suggests - only the elements of arr (e.g. arr[0]) have x elements (accessed as arr[0].x).
Assuming this is for academic purposes (why else would you declare your own sorting algorithm instead of using one of the ones already implemented with a custom comparator?), you can do this a few ways:
Array
std::array<point, 10> myArray; // declares an array of size 10 for points
template<size_t N>
void quicksort(std::array<point, N>& arr, ...)
{
// implement sort operating on arr
}
Vector
std::vector<point> myVector; // declares a dynamic array/vector of points
void quicksort(std::vector<point>& arr, ...)
{
// implement sort operating on arr
}
If for some god-awful reason, you want to keep it in C:
Legacy
const size_t SIZE = 10;
point arr[SIZE]; // declare an array of 10 points
void quicksort(point* p, const size_t n, ...)
{
// implement sort operating on elements in p passing in SIZE for n
}
I'd rather define the function as:
void quicksort(void *a,int left,int right, size_t size, int (*fp)(void*,void*));
size is the size of one element of the array and fp is a compare function which returns true if the first argument is greater than or equal to the second. Now you can call the function as:
quicksort(arr,0,n-1,sizeof(arr[0]), compare);
where function compare is something like:
int compare(void* a, void* b) { return *((int*)a) >= *((int*)b); }
Rest of the implementation of function is trivial I think.
(Almost) never try to fool the system by passing a pointer to a member when you really want to pass a pointer to an object. Do as Grijesh suggested. Passing a member can lead to horrible side effects. For example, quicksort is going to sort all the integers together, regardless of which of them are x's and which are y's. In milder cases you may get wrong comparison criteria, and often hard-to-debug effects such as incorrect pointer optimization. Just be honest with the compiler and pass the object pointer if you need to pass an object pointer. There are very, very few exceptions, mostly to do with low-level system programming where the "other side" of the function call won't be able to handle the object.
I'm wondering how to properly use pointers in for and while loops in C++. Usually I write using C instead of C++. The only reason I'm using the C++ std library this time is so I can use the complex number functions required by other mathematical functions in the code.
As part of the assignment we were given the following function declaration. The part that I wrote is commented within the function.
typedef std::complex<double> complex;
// Evaluates a polynomial using Horner's approach.
// Inputs:
// [coeffs, coeffs_end) - polynomial coefficients, ordered by descending power
// x - point of evaluation
// Outputs:
// p - value of polynomial at x
// dp - value of polynomial derivative at x
// ddp - value of polynomials second derivative at x
//
template<typename T>
inline void poly_val(T const* coeffs, T const* coeffs_end, T x, T & p, T & dp, T & ddp)
{
//MY CODE HERE
int i = 0;
const T *pnt = coeffs;
while(pnt != coeffs_end){
//Evaluate coefficients for descending powers
p += coeffs(i)*pow(x,((coeffs_end-1)-i));
pnt++;
i++;
}
}
The function doesn't know the length of the array, so I'm guessing the stop condition is the pointer 'coeffs_end', which points to the last value in the array 'coeffs'. Can I use a pointer in a conditional this way? (Traditionally I would have fed the length of the array into the function, but we can't modify the declarations.)
If I do it this way I keep getting an error when compiling (which I don't understand):
C2064: term does not evaluate to a function taking 1 arguments
for the following line:
p += coeffs(i)*pow(x,((coeffs_end-1)-i));
coeffs(i) is the syntax for calling a function that takes an integer argument. But in your case coeffs is a pointer, so you need to use the [] operator to access the element at its index.
Also ((coeffs_end-1)-i) resolves to an address location. You need to dereference it to get the value at the location.
Maybe it'd be more readable to write this in a cleaner fashion:
#include <cmath>
#include <iterator>
template<typename T>
inline void poly_val(T const* coeffs, T const* coeffs_end, T x, T & p, T & dp, T & ddp)
{
const std::size_t nterms = std::distance(coeffs, coeffs_end);
for (std::size_t i = 0; i != nterms; ++i)
{
p += coeffs[i] * std::pow(x, nterms - 1 - i);
}
}
Since raw pointers can be treated as iterators, we can use std::distance to determine the size of an array bounded by a range [first, last).
Edit: Actually it can be done even more easily:
for (const T * it = coeffs; it != coeffs_end; ++it)
{
p += *it * std::pow(x, std::distance(it, coeffs_end) - 1);
}
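Note that the loops above evaluate each power separately, whereas the assignment's comment asks for Horner's approach and for the first two derivatives. A sketch of what that could look like is below. It assumes the outputs should be overwritten (they are reset to zero first, which the original declaration does not state) and that T can be constructed from an int, as double and std::complex<double> can.

```cpp
// Horner evaluation of a polynomial and its first two derivatives.
// [coeffs, coeffs_end) holds the coefficients, highest power first.
template <typename T>
inline void poly_val(T const* coeffs, T const* coeffs_end, T x,
                     T& p, T& dp, T& ddp)
{
    // Assumption: outputs are overwritten, not accumulated into.
    p = T(0);
    dp = T(0);
    ddp = T(0);
    for (T const* it = coeffs; it != coeffs_end; ++it) {
        // Order matters: each update uses the previous step's values.
        ddp = ddp * x + T(2) * dp;  // from differentiating the dp update
        dp = dp * x + p;            // from differentiating the p update
        p = p * x + *it;            // the usual Horner step
    }
}
```

For the coefficients {1, 2, 3} (that is, x^2 + 2x + 3) at x = 2 this gives p = 11, dp = 6 and ddp = 2.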