I'm trying to determine whether the time overhead introduced by boost::function when evaluating mathematical functions is negligible compared with using function templates.
The code for the benchmark I use is below.
With traditional g++, the overhead of boost::function is negligible:
$ g++ -O3 main.cxx
$ ./a.out
METHOD INTEGRAL TIME TO COMPUTE (SEC)
Direct 0.379885 3.360000
Function template 0.379885 3.380000
Boost function 0.379885 3.400000
With llvm-g++, there is a speed gain of a factor of 1.5 for the function template, but no gain for boost::function.
$ llvm-g++ -O3 main.cxx
METHOD INTEGRAL TIME TO COMPUTE (SEC)
Direct 0.379885 2.170000
Function template 0.379885 2.160000
Boost function 0.379885 3.360000
Is it possible to obtain the 1.5 gain for boost::function and llvm-g++?
#include <boost/function.hpp>
#include <math.h>
#include <stdio.h>
#include <time.h> // clock(), clock_t, CLOCKS_PER_SEC
typedef unsigned int UInt;
using namespace std;
//=============================================================================
// chrono
//=============================================================================
class Chrono
{
clock_t t1_,t2_,dt_;
public:
Chrono(){}
void start() { t1_=clock(); };
void stop() { t2_=clock(); };
double diff() { return ( (double)( t2_ - t1_) ) / CLOCKS_PER_SEC; };
};
//=============================================================================
// function to integrate
//=============================================================================
inline double fct(double x)
{
return 1. / (1.+exp(x));
}
//=============================================================================
// using direct method
//=============================================================================
double direct(double a, double b, UInt numSamplePoints)
{
double delta = (b-a) / (numSamplePoints-1);
double sum = 0.;
for (UInt i=0; i < numSamplePoints-1; ++i)
sum += 1. / (1. + exp(a + i*delta));
return sum * delta;
}
//=============================================================================
// using function template
//=============================================================================
template<double functionToIntegrate(double)>
double integrate(double a, double b, UInt numSamplePoints)
{
double delta = (b-a) / (numSamplePoints-1);
double sum = 0.;
for (UInt i=0; i < numSamplePoints-1; ++i)
sum += functionToIntegrate(a + i*delta);
return sum * delta;
}
//=============================================================================
// using Boost function
//=============================================================================
typedef boost::function<double ( double )> fct_type;
class IntegratorBoost {
public:
fct_type functionToIntegrate;
IntegratorBoost(fct_type fct): functionToIntegrate(fct){}
double integrate(double a, double b, UInt numSamplePoints)
{
double delta = (b-a) / (numSamplePoints-1);
double sum = 0.;
for (UInt i=0; i < numSamplePoints-1; ++i)
sum += functionToIntegrate(a + i*delta);
return sum * delta; // same normalization as the other two methods
}
};
//=============================================================================
// main
//=============================================================================
int main()
{
double integral;
UInt numSamplePoints = 5E07;
Chrono chrono;
printf("%-20s%-10s%-30s\n","METHOD","INTEGRAL","TIME TO COMPUTE (SEC)");
// Direct
chrono.start();
integral = direct(0., 1., numSamplePoints);
chrono.stop();
printf("%-20s%-10f%-30f\n","Direct",integral,chrono.diff());
// Function template
chrono.start();
integral = integrate<fct>(0., 1.,numSamplePoints);
chrono.stop();
printf("%-20s%-10f%-30f\n","Function template",integral,chrono.diff());
// Boost function
chrono.start();
IntegratorBoost intboost(fct);
integral = intboost.integrate(0.,1.,numSamplePoints);
chrono.stop();
printf("%-20s%-10f%-30f\n","Boost function",integral,chrono.diff());
}
Without actually measuring, I am going to venture a claim: using boost::function (or std::function from C++11) cannot be as efficient as the other two options.
The reason is that function uses type erasure to hide the type of the actual functor being used. That means function has to store the callable behind a pointer and dispatch every call through an indirect function call. In the other two methods, on the other hand, the compiler is able to inline the logic and remove the cost of dispatch.
This is quite similar to the often-cited performance difference between the C library's qsort and C++'s sort, where passing a functor gives the compiler a better chance to inline and optimize.
A different question then, is whether this will have an impact on your application, and for that you need to measure. It might be the case that overall the cost of IO, or any other operation dominates your application and this won't make a difference at all.
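To make the distinction concrete, here is a minimal sketch of the same integration loop written twice: once with the callable as a template parameter and once behind std::function. The names integrate_template and integrate_erased are made up for this illustration, and std::function is used in place of boost::function only to keep the example free of external dependencies. Both return the same value; only the way the call is dispatched differs.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Integrand from the question's benchmark.
inline double fct(double x) { return 1.0 / (1.0 + std::exp(x)); }

// The callable's type is a template parameter. When F is a lambda or
// functor type, the compiler knows the exact call target and is free
// to inline it into the loop.
template <typename F>
double integrate_template(F f, double a, double b, unsigned n) {
    assert(n > 1);
    const double delta = (b - a) / (n - 1);
    double sum = 0.0;
    for (unsigned i = 0; i + 1 < n; ++i) sum += f(a + i * delta);
    return sum * delta;
}

// The callable is type-erased. Every call goes through an indirection
// inside std::function, which the optimizer often cannot see through.
double integrate_erased(const std::function<double(double)>& f,
                        double a, double b, unsigned n) {
    assert(n > 1);
    const double delta = (b - a) / (n - 1);
    double sum = 0.0;
    for (unsigned i = 0; i + 1 < n; ++i) sum += f(a + i * delta);
    return sum * delta;
}
```

Calling integrate_template with a lambda such as [](double x) { return fct(x); } gives the compiler the best chance to inline; whether that shows up as a measurable difference still depends on the compiler and has to be timed.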
Related
When trying to use the conditional copy_if algorithm to copy only the values that are lower than the mean of the values in a vector into another vector, I hit a snag with my function object:
struct Lower_than_mean
{
private:
double mean;
vector<double>d1;
public:
Lower_than_mean(vector<double>a)
:d1{a}
{
double sum = accumulate(d1.begin(), d1.end(), 0.0);
mean = sum / (d1.size());
}
bool operator()(double& x)
{
return x < mean;
}
};
int main()
{
vector<double>vd{ 3.4,5.6, 7, 3,4,5.6,9,2 };
vector<double>vd2(vd.size());
copy_if(vd.begin(), vd.end(), vd2, Lower_than_mean(vd));
}
What is the right way of going about this?
You used vd instead of vd.begin() in the call to std::copy_if.
But, seriously, you didn't bother to even read your compiler output...
Also, like @zch suggests, your approach doesn't make sense: don't copy the vector and compute the mean inside the function object. Instead, calculate the mean once, and then your predicate becomes as simple as the lambda [mean](double x) { return x < mean; }.
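Putting both suggestions together, a minimal sketch could look like the following. The helper name below_mean is made up for this example, and std::back_inserter is used instead of a pre-sized destination vector, which is a choice of this sketch rather than something the question requires.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

// Returns the elements of `values` that are strictly below the mean.
// (`below_mean` is a hypothetical helper name, not from the question.)
std::vector<double> below_mean(const std::vector<double>& values) {
    assert(!values.empty());
    // Compute the mean once, up front.
    const double mean =
        std::accumulate(values.begin(), values.end(), 0.0) / values.size();

    std::vector<double> result;
    // copy_if takes an output iterator, not a container. back_inserter
    // grows `result` as needed, so there are no leftover zero elements.
    std::copy_if(values.begin(), values.end(), std::back_inserter(result),
                 [mean](double x) { return x < mean; });
    return result;
}
```

With the vector from the question the mean is 4.95, so the call returns the four values 3.4, 3, 4 and 2.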
I have to decide whether to use templates or virtual inheritance.
In my situation, the trade-offs make it really hard to choose.
Finally, it boiled down to "How much does a virtual call really cost (CPU)?"
I found very few resources that dare to measure the vtable cost in actual numbers, e.g. https://stackoverflow.com/a/158644, which points to page 26 of http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf.
Here is an excerpt from it:
However, this overhead (of virtual) is on the order of 20% and 12% – far less than
the variability between compilers.
Before relying on that figure, I decided to test it myself.
My test code is a little long (~40 lines); you can also see it in action in the links below.
Each number is the ratio of the time taken by virtual calls to the time taken by normal calls.
Unexpectedly, the results contradict what the open-std report states.
http://coliru.stacked-crooked.com/a/d4d161464e83933f : 1.58
http://rextester.com/GEZMC77067 (with custom -O2): 1.89
http://ideone.com/nmblnK : 2.79
My own desktop computer (Visual C++, -O2) : around 1.5
Here it is:
#include <iostream>
#include <chrono>
#include <cstdlib> // rand(), RAND_MAX
#include <vector>
using namespace std;
class B2{
public: int randomNumber=((double) rand() / (RAND_MAX))*10;
virtual ~B2() = default;
virtual int f(int n){return -n+randomNumber;}
int g(int n){return -n+randomNumber;}
};
class C : public B2{
public: int f(int n) override {return n-randomNumber;}
};
int main() {
std::vector<B2*> bs;
const int numTest=1000000;
for(int n=0;n<numTest;n++){
if(((double) rand() / (RAND_MAX))>0.5){
bs.push_back(new B2());
}else{
bs.push_back(new C());
}
};
auto t1 = std::chrono::system_clock::now();
int s=0;
for(int n=0;n<numTest;n++){
s+=bs[n]->f(n);
};
auto t2= std::chrono::system_clock::now();
for(int n=0;n<numTest;n++){
s+=bs[n]->g(n);
};
auto t3= std::chrono::system_clock::now();
auto t21=t2-t1;
auto t32=t3-t2;
std::cout<<t21.count()<<" "<<t32.count()<<" ratio="<< (((float)t21.count())/t32.count()) << std::endl;
std::cout<<s<<std::endl;
for(int n=0;n<numTest;n++){
delete bs[n];
};
}
Question
Is it to be expected that a virtual call is at least 50% slower than a normal call?
Did I test it the wrong way?
I have also read:
AI Applications in C++: How costly are virtual functions? What are the possible optimizations?
Virtual functions and performance - C++
This is a personal project I've been working on and I can't figure out what's going on here (I'm just learning C++). I found answers to very similar problems, but I can't seem to apply the solution. Here is my code with some of the unimportant bits trimmed out:
#include <iostream>
#include <cmath>
#include <complex>
#include <boost/array.hpp>
#include <boost/numeric/odeint.hpp>
#include <gsl/gsl_roots.h>
class Riemann
{
public:
// constructor
Riemann(double leftP, double rightP, double leftrho, double rightrho, \
double leftvx, double rightvx, double leftvy, double rightvy, double gam);
double PL,PR,rhoL,rhoR,vxL,vxR,vyL,vyR,gamma;
// function prototypes
double shockvelocity(double Pg, int sign);
double rarefactionvelocity(double Pg, int sign);
void RfODE(const boost::array<double,6> &vrhovt, \
boost::array<double,6> &dvrhovtdp, double t);
// ~Riemann();
};
Riemann::Riemann(double leftP, double rightP, double leftrho, double rightrho, \
double leftvx, double rightvx, double leftvy, double rightvy, double gam){
// constructs Riemann public variables
}
double Riemann::shockvelocity(double Pg,int sign){
// calculate a shock velocity, not important here...
}
void Riemann::RfODE(const boost::array<double,6> &vrhovt, \
boost::array<double,6> &dvrhovtdp, double t){
// calculates the ODE I want to solve
}
double Riemann::rarefactionvelocity(double Pg, int sign){
double dpsize=0.00001;
double P,rho,vx,vy,vtest;
//
boost::array<double,6> vrhovt = {vx,rho,vy,double(sign),P,gamma}; // initial conditions
boost::numeric::odeint::integrate(std::bind(&Riemann::RfODE,std::ref(*this),std::placeholders::_1,
std::placeholders::_2, std::placeholders::_3),vrhovt,P,Pg,dpsize);
std::cout<<"vRarefaction="<<vrhovt[0]<<std::endl;
return vrhovt[0];
}
double FRiemann(double Pg, void* Riemannvalues){
Riemann* Rvals = (Riemann*)Riemannvalues;
// calls on Riemann::rarefactionvelocity at some point
}
int main(){
double PL= 1000.0;
double PR= 0.01;
double rhoL= 1.0;
double rhoR= 1.0;
double vxL= 0.0;
double vxR= 0.0;
double vyL= 0.0;
double vyR= 0.0;
double gam = 5.0/3.0;
// calls FRiemann to get a root
}
What's happening is that the code runs and calls Riemann::rarefactionvelocity just fine, but for some reason RfODE is never executed (e.g. print statements in that function never execute), and the value returned for vrhovt[0] is of course the value it began with, vx. There are no compiler errors either (using gcc 4.8.1 with the -std=c++11 and -O2 flags). This is very strange because I've tested the rarefaction-specific functions on their own (outside of the Riemann class) and they work, so the problem seems to be that they're in this class. Given how Riemann solvers work, though, I had my reasons for making a class out of these functions, and I would really like to find a way to make this work without a massive rewrite or changing the class structure.
Any help is much appreciated! Thank you! : )
It might be that P is not initialized correctly; at least I don't see it being set in your code. P needs to be smaller than Pg, otherwise you are already past the end of the integration.
Also, don't use bind; use a lambda instead. I think bind is more or less superseded by lambdas in C++11/C++14. It is possible that bind doesn't get the references right.
double Riemann::rarefactionvelocity(double Pg, int sign)
{
// ...
// not tested; generic lambdas (auto parameters) need C++14
using namespace boost::numeric::odeint;
integrate( [this](auto const& x, auto &dxdt, auto t) {
this->RfODE(x, dxdt, t); }, vrhovt, P, Pg, dpsize);
}
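Since the snippet above is untested and needs Boost, here is a self-contained sketch of the same two points with no external library. Everything in it is hypothetical: euler_integrate is a simple fixed-step stand-in for an ODE driver (it is not odeint's integrate), and Decay is a toy class in place of Riemann. The lambda spells out its parameter types, so it also compiles with -std=c++11.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using State = std::array<double, 2>;

// Hypothetical stand-in for an ODE driver: fixed-step explicit Euler that
// calls system(x, dxdt, t) once per step and returns the number of steps.
template <typename System>
int euler_integrate(System system, State& x, double t0, double t1, double dt) {
    assert(dt > 0.0);
    int steps = 0;
    for (double t = t0; t < t1; t += dt, ++steps) {
        State dxdt{};
        system(x, dxdt, t);
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += dt * dxdt[i];
    }
    return steps;
}

class Decay {
public:
    explicit Decay(double rate) : rate_(rate) {}

    // Right-hand side dx/dt = -rate * x, written as a member function.
    void rhs(const State& x, State& dxdt, double /*t*/) const {
        for (std::size_t i = 0; i < x.size(); ++i) dxdt[i] = -rate_ * x[i];
    }

    // Wrap the member function in a lambda that captures `this`.
    int solve(State& x, double t0, double t1, double dt) const {
        return euler_integrate(
            [this](const State& s, State& d, double t) { rhs(s, d, t); },
            x, t0, t1, dt);
    }

private:
    double rate_;
};
```

In this sketch, solve(x, 0.0, 1.0, 0.25) takes four steps, while a start value that is not below the end value takes none and leaves x untouched. That is the kind of symptom an uninitialized P could produce.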
Say I have a C++ function that looks like this:
double myfunction(double a, double b) {
// do something
}
Which I then call like this:
double a = 1.0;
double b = 2.0;
double good_r = myfunction(a, b);
double bad_r = myfunction(b, a); // compiles fine
I would like to make sure that a and b are never provided in the wrong order.
What is the best way to ensure this in C++?
Other languages allow named parameters, like this:
double good_r = myfunction(a=a, b=b);
double bad_r = myfunction(a=b, b=a); // mistake immediately obvious
double bad_r = myfunction(b=b, a=a); // compiles fine
Or perhaps the problem can be partly solved using types, i.e.
double my_type_safe_function(a_type a, b_type b) {
// do something
}
a_type a = 1.0;
b_type b = 2.0;
double good_r = my_type_safe_function(a, b);
double bad_r = my_type_safe_function(b, a); // compilation error
EDIT: A couple of people have asked what I mean by the "wrong order." What I mean is that, in real code a and b have some significance. For example, the arguments might instead be height and width. The difference between them is very important for the function to return the correct result. However, they are both floats and they both have the same dimensions (i.e. a length). Also, there is no "obvious" order for them. The person writing the function declaration may assume (width, height) and the person using the function may assume (height, width). I would like a way to ensure this doesn't happen by mistake. With two parameters it is easy to be careful with the order, but in a large project and with up to 6 arguments mistakes creep in.
Ideally I would like the checks to be done at compile time, and for there to be no performance hit (i.e. at the end of the day they are treated as plain old floats or whatever).
How about this:
struct typeAB {float a; float b; };
double myfunction(typeAB p) {
// do something
return p.a - p.b;
}
int main()
{
typeAB param;
param.a = 1.0;
param.b = 2.0;
float result = myfunction(param);
return 0;
}
Of course, you can still mess up when you assign your parameter(s) but that risk is hard to avoid :)
A variant is to have one struct per "new" type, and then make them go away in optimized builds using macros.
Something along these lines (only slightly tested, so it could be way off):
#include <iostream>
using namespace std;
#define SAFE 0 // set to 1 to enable the type checks
#if SAFE
#define NEWTYPE(name, type) \
struct name { \
type x; \
explicit name(type x_) : x(x_) {}\
operator type() const { return x; }\
}
#else
#define NEWTYPE(name, type) typedef type name
#endif
NEWTYPE(Width, double);
NEWTYPE(Height, double);
double area(Width w, Height h)
{
return w * h;
}
int main()
{
cout << area(Width(10), Height(20)) << endl;
// This line says 'Could not convert from Height to Width' in g++ if SAFE is on.
cout << area(Height(10), Width(20)) << endl;
}
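A macro-free variation on the same idea is a small class template with a tag parameter. This is a sketch of one possible approach, not the code from the answer above; the names Strong, WidthTag and HeightTag are made up for the illustration. The wrapper holds nothing but a double, so an optimizing compiler can normally treat it like a plain double.

```cpp
#include <cassert>
#include <type_traits>

// A thin wrapper; the Tag parameter makes each instantiation a distinct type.
template <typename Tag>
class Strong {
public:
    explicit constexpr Strong(double v) : value_(v) {}
    constexpr double get() const { return value_; }
private:
    double value_;
};

// Tag types exist only to tell the wrappers apart.
struct WidthTag {};
struct HeightTag {};
using Width = Strong<WidthTag>;
using Height = Strong<HeightTag>;

// The wrapper adds no storage, and the two types cannot be mixed up.
static_assert(sizeof(Width) == sizeof(double), "no size overhead expected");
static_assert(!std::is_convertible<Height, Width>::value,
              "Width and Height must not be interchangeable");

double area(Width w, Height h) {
    assert(w.get() >= 0.0 && h.get() >= 0.0); // lengths are non-negative
    return w.get() * h.get();
}
```

With these definitions area(Width(10), Height(20)) compiles, while area(Height(10), Width(20)) is rejected at compile time.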
I think you already provided the easiest solution, using types.
One alternative could be using a builder class and method chaining.
Like:
class MyFunctionBuilder {
public:
MyFunctionBuilder & paramA(double value);
MyFunctionBuilder & paramB(double value);
double execute();
(...)
};
Which you would use like this:
double good_r = MyFunctionBuilder().paramA(a).paramB(b).execute();
But this is a lot of extra code to write!
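For reference, a filled-in version of the builder might look like the sketch below. The body of myfunction is a placeholder, and the hasA_/hasB_ flags are an addition of this sketch. Note that the check for a missing parameter happens at run time (via assert), not at compile time.

```cpp
#include <cassert>

// Hypothetical stand-in for the function from the question.
inline double myfunction(double a, double b) { return a - b; }

class MyFunctionBuilder {
public:
    // Each setter names its argument at the call site and returns *this
    // so that calls can be chained.
    MyFunctionBuilder& paramA(double value) { a_ = value; hasA_ = true; return *this; }
    MyFunctionBuilder& paramB(double value) { b_ = value; hasB_ = true; return *this; }

    // Runs the wrapped function; both parameters must have been set.
    double execute() const {
        assert(hasA_ && hasB_);
        return myfunction(a_, b_);
    }

private:
    double a_ = 0.0, b_ = 0.0;
    bool hasA_ = false, hasB_ = false;
};
```

Because each value is attached to a named setter, MyFunctionBuilder().paramB(b).paramA(a).execute() gives the same result as the call with the setters in the other order.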
What is the "wrong order" actually? In this example of yours
double myfunction(double a, double b) {
// do something
}
double a = 1.0;
double b = 2.0;
double good_r = myfunction(a, b);
double bad_r = myfunction(b, a);
how do you actually want to know if this is the right order? What if the variables would be named "quapr" and "moo" instead of "a" and "b"? Then it would be impossible to guess whether the order is right or wrong just by looking at them.
With this in mind, you can do at least two things. The first is to give meaningful names to the arguments, e.g.
float getTax( float price, float taxPercentage )
instead of
float getTax( float a, float b )
Second, do the necessary checks inside:
float divide( float dividend, float divisor )
{
if( divisor == 0 )
{
throw "omg!";
}
return dividend / divisor;
}
It is possible to do more complex checks, such as making a functor and setting its parameters explicitly, but in most cases that just complicates things without much benefit.
I have a class called Universe. The class includes a member function to calculate distance and requires numerically integrating an ugly looking function. I was trying to use GSL to perform the integration but I get the following error when I attempt to compile the library -
$ g++ -c -O3 -std=c++11 Universe.cpp -o Universe.o
$ error: cannot convert ‘Universe::Hz’ from type ‘double (Universe::)(double, void*)’ to type ‘double (*)(double, void*)’
Here's the class Universe without the constructors (for brevity):
Universe.h
#ifndef UNIVERSE_H
#define UNIVERSE_H
#include <cmath>
#include <gsl/gsl_integration.h>
using namespace std;
class Universe {
private:
static constexpr double c = 299792458.0, Mpc2Km = 3.08567758e+19, Yrs2Sec = 3.15569e7;
double H0 = 67.77, OmegaM = (0.022161+0.11889)/(H0*H0), OmegaL = 0.6914, OmegaG = 8.24e-5, OmegaK = 0.0009;
double Ez(double z);
double Hz(double z, void* params);
public:
double distH, timeH;
Universe() = default;
Universe(double h0);
Universe(double omegaM, double omegaL);
Universe(double h0, double omegaM, double omegaL);
Universe(double omegaM, double omegaL, double omegaG, double omegaK);
Universe(double h0, double omegaM, double omegaL, double omegaG, double omegaK);
//double radius();
//double age();
double distC(double z);
};
#endif
Universe.cpp
#include <cmath>
#include <gsl/gsl_integration.h>
#include "Universe.h"
using namespace std;
double Universe::Hz(double z, void* params) {
double result = 1.0/pow(OmegaL + pow(1.0+z,3.0)*OmegaM + pow(1.0+z,4.0)*OmegaG + pow(1.0+z,2.0)*OmegaK, 0.5);
return result;
}
double Universe::distC(double z) {
double lower_limit = 0.0, abs_error = 1.0e-8, rel_error = 1.0e-8, alpha = 0.0, result, error;
gsl_integration_workspace *work_ptr = gsl_integration_workspace_alloc(1000);
gsl_function Hz_function;
void* params_ptr = &alpha;
Hz_function.function = Universe::Hz;
Hz_function.params = params_ptr;
gsl_integration_qags(&Hz_function, lower_limit, z, abs_error, rel_error, 1000, work_ptr, &result, &error);
return distH*result;
}
I don't quite know how to troubleshoot this problem and I'm using GSL for the first time based on the documentation at:
http://www.gnu.org/software/gsl/manual/html_node/Numerical-integration-examples.html
and the following guide:
http://www.physics.ohio-state.edu/~ntg/780/gsl_examples/qags_test.cpp
Thank you for looking and any answers!
Try the following: Make your Hz function static, like so:
static double Hz(double z, void* params)
I verified that this works with your code.
I'm not an expert, but I believe (hand-wavy explanation follows) the basic problem is that the gsl_function structure needs a pointer to an ordinary function, i.e. double (*)(double, void*). A non-static member function has a different type, double (Universe::*)(double, void*), because it has to be called on an object, so the compiler cannot convert one into the other. Making the function static in the class means it can be called without an instance of the class, so it has the ordinary function-pointer type and can be stored in gsl_function. (Maybe someone can give a better explanation than that, but hopefully I'm not too far off track here.)
Hope this helps.
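One thing to keep in mind is that a static member function has no this pointer, so it cannot read non-static members such as OmegaM directly. A common pattern for C-style callbacks is to pass the object through the void* params argument. The sketch below illustrates this without GSL: c_function and c_integrate are made-up stand-ins with the same shape as gsl_function and a C integration routine, and Model is a toy class in place of Universe.

```cpp
#include <cassert>

// Hypothetical stand-in with the same shape as GSL's gsl_function:
// a plain function pointer plus an opaque params pointer.
struct c_function {
    double (*function)(double x, void* params);
    void* params;
};

// Hypothetical stand-in for a C integration routine (midpoint rule).
inline double c_integrate(const c_function* f, double a, double b, int n) {
    assert(n > 0);
    const double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += f->function(a + (i + 0.5) * h, f->params);
    return sum * h;
}

class Model {
public:
    explicit Model(double k) : k_(k) {}

    // Ordinary member function: uses the object's state.
    double integrand(double x) const { return k_ * x; }

    // Static trampoline: it has no `this`, so its type matches
    // double (*)(double, void*). The object arrives through params.
    static double trampoline(double x, void* params) {
        return static_cast<const Model*>(params)->integrand(x);
    }

    double integral(double a, double b) const {
        c_function f;
        f.function = &Model::trampoline;
        f.params = const_cast<Model*>(this); // only read through a const pointer
        return c_integrate(&f, a, b, 1000);
    }

private:
    double k_;
};
```

Applied to the question, the same idea would mean setting Hz_function.params to this and casting it back to a Universe* inside a static Hz, but that wiring is an assumption of this sketch and has not been tested against GSL.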
What happens when you change the method name of Hz?