Why doesn't g++ fully optimize these loops / operator calls? - c++

Consider this struct, which could for example represent a pair of 4D vectors:
#include <algorithm>
#include <iostream>
struct A {
    double x[4];
    double y[4];
    A() : A(0.0, 0.0) { }
    A(double xp, double yp)
    {
        std::fill_n(x, 4, xp);
        std::fill_n(y, 4, yp);
    }
    // Simple element-wise delegation of the mathematical operations
    friend A operator+(const A &l, const A &r)
    {
        A res;
        for (int i = 0; i < 4; i++) {
            res.x[i] = l.x[i] + r.x[i];
            res.y[i] = l.y[i] + r.y[i];
        }
        return res;
    }
    friend A operator*(const A &l, const double &r)
    {
        A res;
        for (int i = 0; i < 4; i++) {
            res.x[i] = l.x[i] * r;
            res.y[i] = l.y[i] * r;
        }
        return res;
    }
    friend A operator*(const double &l, const A &r)
    {
        A res;
        for (int i = 0; i < 4; i++) {
            res.x[i] = l * r.x[i];
            res.y[i] = l * r.y[i];
        }
        return res;
    }
    friend std::ostream &operator<<(std::ostream &stream, const A &a)
    {
        // Write to the stream that was passed in rather than to std::cout
        for (int i = 0; i < 4; i++)
            stream << "(" << a.x[i] << "|" << a.y[i] << ") ";
        return stream;
    }
};
For convenience, the struct has a few operators defined, which simply delegate to member- and element-wise operations.
Now, consider two different versions of a second struct B, that contains objects of A:
struct B { // version 1
    double f1;
    double f2; // Two coefficients
    A buff1;
    A buff2;
    A buffa[4]; // Objects of struct A
    // The following functions use the operators defined on struct A
    void mathA(int i, double d) // Some math operations
    {
        buff2 = buff1 + buffa[i] * d;
    }
    void mathB() // Some more math (vector) operations
    {
        buff1 = f1 * (buffa[0] + buffa[3]) + f2 * (buffa[1] + buffa[2]);
    }
};
struct B { // version 2
    double f1;
    double f2; // Two coefficients
    A buff1;
    A buff2;
    A buffa[4]; // Objects of struct A
    // The following functions DO NOT use the operators defined on struct A
    void mathA(int i, double d) // Some math operations
    {
        for (int j = 0; j < 4; j++) {
            buff2.x[j] = buff1.x[j] + buffa[i].x[j] * d;
            buff2.y[j] = buff1.y[j] + buffa[i].y[j] * d;
        }
    }
    void mathB() // Some more math (vector) operations
    {
        for (int j = 0; j < 4; j++) {
            buff1.x[j] = f1 * (buffa[0].x[j] + buffa[3].x[j]) + f2 * (buffa[1].x[j] + buffa[2].x[j]);
            buff1.y[j] = f1 * (buffa[0].y[j] + buffa[3].y[j]) + f2 * (buffa[1].y[j] + buffa[2].y[j]);
        }
    }
};
As you can see, both versions of struct B perform the same mathematical operations. The first version uses the operators of struct A, while the second does not use them at all and performs the operations manually in mathA and mathB.
Let's add a main function to test the functionality of struct B (Diff window, "Left"):
int main(int argc, char **argv)
{
    B b;
    b.f1 = 0.5;
    b.f2 = 0.8;
    b.buff1 = A(0.7, 0.8);
    b.buff2 = A(1.7, 2.8);
    b.mathA(1, 0.9);
    std::cout << b.buff1 << "\n" << b.buff2;
}
I have prepared examples of both cases in godbolt here. Both cases are compiled using g++ 7.1.0 on optimization level -O3. The left case corresponds to version 1, the right case to version 2 of struct B.
As you can see in the disassembly, the compiler generates two labels for version 1, which correspond to the mathX functions in struct B:
B::mathA(int, double):
B::mathB():
As my analysis shows, the first example is much slower compared to the second example. The functions are called more than 1 billion times in my actual code and are thus very much contributing to the overall runtime. I assume this is partially due to the jumps to the function definitions.
Is there a way to force the compiler to produce an assembly that is identical to the second example? I.e. with using the definitions of the operators?
Since the compiler seemed to generate labels and jumps for mathX(…) my idea was to attempt to inline these functions. Using the inline keyword changed nothing, but for g++ you can use __attribute__((always_inline)) which will force the compiler to inline the function (documentation):
struct B { // version 3
    // …
    __attribute__((always_inline)) void mathA(int i, double d)
    // …
    __attribute__((always_inline)) void mathB()
    // …
};
This improved the performance, which is now somewhere between version 1 and version 2. This is still not perfect, but if no better solution is found, I will go with this one.
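For reference, a minimal sketch of where the attribute can go, assuming GCC or Clang (other compilers spell this differently). Vec2, add and sum_components are hypothetical stand-ins and not part of the code above. Writing the attribute before the return type is accepted for declarations and definitions alike:

```cpp
#include <cassert>

// Hypothetical 2D vector, standing in for struct A.
struct Vec2 {
    double x;
    double y;
};

// GCC/Clang-specific: request that every call to this function is inlined.
// The attribute is written before the return type.
__attribute__((always_inline)) inline Vec2 add(const Vec2 &l, const Vec2 &r)
{
    return Vec2{l.x + r.x, l.y + r.y};
}

// A caller of add(); the attribute asks the compiler to expand add() here.
inline double sum_components(const Vec2 &a, const Vec2 &b)
{
    const Vec2 s = add(a, b);
    return s.x + s.y;
}
```

Whether this closes the gap to the hand-written loops still has to be checked in the disassembly, since inlining alone does not guarantee that temporaries are eliminated.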


How do I implement numerical differentiation (f'(x) = (f(x+h) - f(x)) / h)?

2nd task:
For a function f : R^n → R, the gradient at a point x ∈ R^n is to be calculated:
- Implement a function
CMyVector gradient(CMyVector x, double (*function)(CMyVector x)),
which is given the point x in the first parameter and the function f as a function pointer in the second parameter, and which calculates the gradient g = grad f(x) numerically using
g_i = (f(x_1, ..., x_{i-1}, x_i + h, x_{i+1}, ..., x_n) - f(x_1, ..., x_n)) / h
for a fixed h = 10^-8.
My currently written program:
#pragma once
#include <cmath>
#include <string>
#include <vector>
class CMyVektor
{
private:
    /* data */
    int Dimension = 0;
    std::vector<double> Vector;
public:
    // Public methods
    CMyVektor();
    void set_Dimension(int Dimension /* current dimension */);
    void set_specified_Value(int index, double Value);
    double get_specified_Value(int key);
    int get_Vector_Dimension();
    double get_length_Vektor();
    double& operator [](int index);
    std::string umwandlung();
};
CMyVektor::CMyVektor(/* args */)
{
    Vector.resize(0, 0);
}
// No destructor is needed: std::vector<double> releases its own storage,
// and its elements are plain doubles that must not be deleted.
void CMyVektor::set_Dimension(int Dimension /* current dimension */)
{
    this->Dimension = Dimension;
    Vector.resize(Dimension, 0.0);
}
void CMyVektor::set_specified_Value(int index, double Value)
{
    if (!Vector.empty()) {
        Vector[index] = Value;
    }
}
double CMyVektor::get_specified_Value(int key)
{
    // from the start to the end of the vector
    for (unsigned i = 0; i < Vector.size(); i++)
        if (Vector[i] == key) {
            return Vector[i];
        }
    return 0.0; // no matching element found
}
int CMyVektor::get_Vector_Dimension()
{
    return Vector.size();
}
// Computes the magnitude ("length") of a vector.
double CMyVektor::get_length_Vektor()
{
    double length = 0;
    for (size_t i = 0; i < Vector.size(); i++)
        length += Vector[i] * Vector[i]; // ^ is bitwise XOR in C++, not a power
    return sqrt(length);
}
// Overload the [] operator
double& CMyVektor::operator [](int index)
{
    return Vector[index];
}
#include <iostream>
#include "ClassVektor.h"
using namespace std;
CMyVektor operator+(CMyVektor a, CMyVektor b);
CMyVektor operator*(double lambda, CMyVektor a);
CMyVektor gradient(CMyVektor x, double (*funktion)(CMyVektor x));
int main() {
    CMyVektor V1;
    CMyVektor V2;
    CMyVektor C;
    double lambda = 2.0; // was used below without a declaration; the value is arbitrary
    C = V1 + V2;
    std::cout << "Addition : " << "(";
    for (int i = 0; i < C.get_Vector_Dimension(); i++)
        std::cout << C[i] << " ";
    std::cout << ")" << endl;
    C = lambda * C;
    if (C.get_Vector_Dimension() > 0)
        std::cout << "Scalar product: " << C[0] << " ";
}
// Vector addition
CMyVektor operator+(CMyVektor a, CMyVektor b)
{
    CMyVektor c;
    // If the dimensions are equal, add element by element
    if (a.get_Vector_Dimension() == b.get_Vector_Dimension())
    {
        c.set_Dimension(a.get_Vector_Dimension());
        for (int i = 0; i < a.get_Vector_Dimension(); i++)
            c[i] = a[i] + b[i];
    }
    return c;
}
// Computes the product of a scalar and a vector
CMyVektor operator*(double lambda, CMyVektor a)
{
    CMyVektor c;
    c.set_Dimension(a.get_Vector_Dimension());
    for (int i = 0; i < a.get_Vector_Dimension(); i++)
        c[i] = lambda * a[i];
    return c;
}
/*
 * Difference quotient: (F(x0+h) - F(x0)) / h
 * First parameter: the point x. Second parameter: the function.
 * Determines the gradient numerically.
 */
CMyVektor gradient(CMyVektor x, double (*funktion)(CMyVektor x))
{
    // TODO: not implemented yet
}
My problem now is that I don't quite know how to deal with the
CMyVector gradient(CMyVector x, double (*function)(CMyVector x))
function and how to define a function that corresponds to it.
I hope that it is enough information. Many thanks.
The function parameter is the f in the difference formula. It takes a CMyVector parameter x and returns a double value. You need to supply a function parameter name. I'll assume func for now.
I don't see a parameter for h. Are you going to pass a single small value into the gradient function or assume a constant?
The parameter x is a vector. Will you add a constant h to each element?
This function specification is a mess.
Function returns a double. How do you plan to turn that into a vector?
No wonder you're confused. I am.
Are you trying to do something like this?
You are given a function signature
CMyVector gradient(CMyVector x, double (*function)(CMyVector x))
Without knowing the exact definition, I will assume that at least the basic numerical vector operations are defined. That means that the following statements compile:
CMyVector x {2.,5.,7.};
CMyVector y {1.,7.,4.};
CMyVector z {0.,0.,0.};
double a = 0.;
// vector addition and assignment
z = x + y;
// vector scalar multiplication and division
z = z * a;
z = x / 0.1;
Also we need to know the dimension of the CMyVector class. I assumed and will continue to do so that it is three dimensional.
The next step is to understand the function signature. You get two parameters. The first one denotes the point, at which you are supposed to calculate the gradient. The second is a pointer to the function f in your formula. You do not know it, but can call it on a vector from within your gradient function definition. That means, inside of the definition you can do something like
double f_at_x = function(x);
and the f_at_x will hold the value f(x) after that operation.
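A minimal sketch of that call mechanism, assuming std::vector<double> as a stand-in for CMyVector (the names Vec, sum_of_squares and evaluate_at are made up for illustration):

```cpp
#include <cassert>
#include <vector>

using Vec = std::vector<double>; // stand-in for CMyVector (assumption)

// An example f : R^n -> R, here the sum of squares of the components.
double sum_of_squares(Vec x)
{
    double s = 0.0;
    for (double v : x) s += v * v;
    return s;
}

// Receives the function as a pointer and calls it like an ordinary function.
double evaluate_at(Vec x, double (*function)(Vec x))
{
    return function(x);
}
```

The caller passes the function by name, for example evaluate_at(Vec{1.0, 2.0, 3.0}, sum_of_squares).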
Armed with this, we can try to implement the formula, that you mentioned in the question title:
CMyVector gradient(CMyVector x, double (*function)(CMyVector x)) {
    double h = 0.001;
    // calculate first element of the gradient
    CMyVector e1 {1.0, 0.0, 0.0};
    double result1 = ( function(x + e1*h) - function(x) )/h;
    // calculate second element of the gradient
    CMyVector e2 {0.0, 1.0, 0.0};
    double result2 = ( function(x + e2*h) - function(x) )/h;
    // calculate third element of the gradient
    CMyVector e3 {0.0, 0.0, 1.0};
    double result3 = ( function(x + e3*h) - function(x) )/h;
    // return the result
    return CMyVector {result1, result2, result3};
}
There are several things worth mentioning in this code. First and most important, I have chosen h = 0.001. This may look like a very arbitrary choice, but the choice of the step size will very much impact the precision of your result. You can find a whole lot of discussion about that topic here. I took the same value that, according to that Wikipedia page, a lot of handheld calculators use internally. That might not be the best choice for the floating-point precision of your processor, but it should be a fair one to start with.
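To get a feel for how much the step size matters, here is a small sketch, assuming double precision and using sin as a test function because its derivative cos is known exactly. The helper names are made up for illustration. A large h suffers from truncation error, while a very small h suffers from rounding error in f(x + h) - f(x):

```cpp
#include <cassert>
#include <cmath>

// Forward difference quotient (f(x + h) - f(x)) / h for a scalar function.
double forward_diff(double (*f)(double), double x, double h)
{
    return (f(x + h) - f(x)) / h;
}

double sine(double x) { return std::sin(x); }

// Absolute error of the forward difference of sin at x,
// measured against the exact derivative cos(x).
double sine_error(double x, double h)
{
    return std::fabs(forward_diff(sine, x, h) - std::cos(x));
}
```

Comparing sine_error(1.0, h) for h = 1e-3, 1e-8 and 1e-15 shows that the middle value gives the smallest error of the three, so neither "as small as possible" nor a comfortable round number is automatically the best choice.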
Secondly, the code looks very ugly to an advanced programmer. We are doing almost the same thing for each of the three dimensions. Usually you would do that in a for loop. Exactly how this is done depends on how the CMyVector type is defined.
Since CMyVektor is just a rewrite of the valarray container, I will use valarray directly:
#include <iostream>
#include <valarray>
using namespace std;
using CMyVektor = valarray<double>;
CMyVektor gradient(CMyVektor x, double (*funktion)(CMyVektor x));
const double h = 0.00000001;
int main()
{
    // sum(x_i^2 + x_i)--> gradient: 2*x_i + 1
    auto fun = [](CMyVektor x) {return (x*x + x).sum();};
    CMyVektor d = gradient(CMyVektor{1,2,3,4,5}, fun);
    for (auto i: d) cout << i<<' ';
    return 0;
}
CMyVektor gradient(CMyVektor x, double (*funktion)(CMyVektor x)){
    CMyVektor grads(x.size());
    CMyVektor pos(x.size()); // unit vector e_i, built one component at a time
    for (size_t i = 0; i<x.size(); i++){
        pos[i] = 1;
        grads[i] = (funktion(x + h * pos) - funktion(x))/ h;
        pos[i] = 0;
    }
    return grads;
}
This prints out 3 5 7 9 11, which is what is expected from the given function at the given location.

Pass a function as argument, without knowledge of number of arguments of this function [duplicate]

long time browser, first time asker here. I've written a number of scripts for doing various 1D numerical integration methods and compiled them into a library. I would like that library to be as flexible as possible regarding what it is capable of integrating.
Here I include an example: a very simple trapezoidal rule example where I pass a pointer to the function to be integrated.
// Numerically integrate (*f) from a to b
// using the trapezoidal rule.
double trap(double (*f)(double), double a, double b) {
    int N = 10000;
    double step = (b-a)/N;
    double s = 0;
    for (int i=0; i<=N; i++) {
        double xi = a + i*step;
        if (i == 0 || i == N) { s += (*f)(xi); }
        else { s += 2*(*f)(xi); }
    }
    s *= (b-a)/(2*N);
    return s;
}
This works great for simple functions that only take one argument. Example:
double a = trap(sin,0,1);
However, sometimes I may want to integrate something that has more parameters, like a quadratic polynomial. In this example, the coefficients would be defined by the user before the integration. Example code:
// arbitrary quadratic polynomial
double quad(double A, double B, double C, double x) {
    return (A*pow(x,2) + B*x + C);
}
Ideally, I would be able to do something like this to integrate it:
double b = trap(quad(1,2,3),0,1);
But clearly that doesn't work. I have gotten around this problem by defining a class that has the coefficients as members and the function of interest as a member function:
class Model {
    double A,B,C;
public:
    Model() { A = 0; B = 0; C = 0; }
    Model(double x, double y, double z) { A = x; B = y; C = z; }
    double func(double x) { return (A*pow(x,2)+B*x+C); }
};
However, then my integration function needs to change to take an object as input instead of a function pointer:
// Numerically integrate model.func from a to b
// using the trapezoidal rule.
double trap(Model poly, double a, double b) {
    int N = 10000;
    double step = (b-a)/N;
    double s = 0;
    for (int i=0; i<=N; i++) {
        double xi = a + i*step;
        if (i == 0 || i == N) { s += poly.func(xi); }
        else { s += 2*poly.func(xi); }
    }
    s *= (b-a)/(2*N);
    return s;
}
This works fine, but the resulting library is not very independent, since it needs the class Model to be defined somewhere. Also, ideally the Model should be able to change from user-to-user so I wouldn't want to fix it in a header file. I have tried to use function templates and functors to get this to work but it is not very independent since again, the template should be defined in a header file (unless you want to explicitly instantiate, which I don't).
So, to sum up: is there any way I can get my integration functions to accept arbitrary 1D functions with a variable number of input parameters while still remaining independent enough that they can be compiled into a stand-alone library? Thanks in advance for the suggestions.
What you need is templates and std::bind() (or its boost::bind() counterpart if you can't afford C++11). For instance, this is what your trap() function would become:
template<typename F>
double trap(F&& f, double a, double b) {
    int N = 10000;
    double step = (b-a)/N;
    double s = 0;
    for (int i=0; i<=N; i++) {
        double xi = a + i*step;
        if (i == 0 || i == N) { s += f(xi); }
        //                           ^
        else { s += 2* f(xi); }
        //             ^
    }
    s *= (b-a)/(2*N);
    return s;
}
Notice that we are generalizing from function pointers and allowing any type of callable object (including a C++11 lambda, for instance) to be passed in. Therefore, the syntax for invoking the user-provided function is not (*f)(param) (which is meant for function pointers and does not work for arbitrary callables), but just f(param).
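As a small illustration of that point, the sketch below uses a made-up helper call_with (not part of trap()) that invokes whatever it is given with the plain call syntax. The functor Scale is likewise hypothetical:

```cpp
#include <cassert>

// Calls any callable that accepts a double: function pointer, lambda, or functor.
template<typename F>
double call_with(F&& f, double x)
{
    return f(x); // plain call syntax works for all of them
}

// An ordinary function, usable by name or by pointer.
double twice(double x) { return 2 * x; }

// Hypothetical functor carrying state (a scale factor).
struct Scale {
    double factor;
    double operator()(double x) const { return factor * x; }
};
```

call_with(twice, 3.0), call_with(Scale{4.0}, 0.5) and a call with a capturing lambda all go through the same template.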
Concerning the flexibility, let's consider two hardcoded functions (and pretend them to be meaningful):
double foo(double x)
{
    return x * 2;
}
double bar(double x, double y, double z, double t)
{
    return x + y * (z - t);
}
You can now pass trap() either the first function directly, or the result of binding the last three arguments of the second function to some particular values (you are free to choose which arguments to bind):
#include <functional>
int main()
{
    trap(foo, 0, 42);
    trap(std::bind(bar, std::placeholders::_1, 42, 1729, 0), 0, 42);
}
Of course, you can get even more flexibility with lambdas:
#include <functional>
#include <iostream>
int main()
{
    trap(foo, 0, 42);
    trap(std::bind(bar, std::placeholders::_1, 42, 1729, 0), 0, 42);
    int x = 1729; // Or the result of some computation...
    int y = 42; // Or some particular state information...
    trap([&] (double d) -> double
    {
        x += 42 * d; // Or some meaningful computation...
        y = 1; // Or some meaningful operation...
        return x;
    }, 0, 42);
    std::cout << y; // Prints 1
}
And you can also pass your own stateful functors to trap(), or some callable objects wrapped in an std::function object (or boost::function if you can't afford C++11). The choice is pretty wide.
Here is a live example.
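Since the question asks for something that can be compiled into a stand-alone library, here is a sketch of the std::function variant mentioned above, assuming C++11. The signature is not a template, so the definition of trap() could live in a .cpp file. The helper integrate_quad is made up for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Non-template signature: callers only need this declaration in a header.
double trap(const std::function<double(double)>& f, double a, double b)
{
    const int N = 10000;
    const double step = (b - a) / N;
    double s = 0;
    for (int i = 0; i <= N; i++) {
        const double xi = a + i * step;
        if (i == 0 || i == N) { s += f(xi); }
        else { s += 2 * f(xi); }
    }
    return s * step / 2;
}

// The user's function with extra parameters, as in the question.
double quad(double A, double B, double C, double x)
{
    return A * x * x + B * x + C;
}

// User code fixes the coefficients in a lambda before integrating.
double integrate_quad(double A, double B, double C, double a, double b)
{
    return trap([=](double x) { return quad(A, B, C, x); }, a, b);
}
```

The trade-off is an indirect call through std::function for every evaluation of the integrand, which the template version avoids.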
What you are trying to do is make this possible:
trap( quad, 1, 2, 3, 0, 1 );
With C++11 we have alias templates and variadic templates:
template< typename... Ts >
using custom_function_t = double (*) ( double, Ts... );
The above defines custom_function_t, a pointer to a function that takes a double and a variable number of further arguments.
So your trap function becomes:
#include <cmath>
template< typename... Ts >
double trap( custom_function_t<Ts...> f, Ts... args, double a, double b ) {
    int N = 10000;
    double step = (b-a)/N;
    double s = 0;
    for (int i=0; i<=N; i++) {
        double xi = a + i*step;
        if (i == 0 || i == N) { s += f(xi, args...); }
        else { s += 2*f(xi, args...); }
    }
    s *= (b-a)/(2*N);
    return s;
}
double foo ( double X ) {
    return X;
}
double quad( double X, double A, double B, double C ) {
    return(A*pow(X,2) + B*X + C);
}
int main() {
    double result_foo = trap( foo, 0, 1 );
    double result_quad = trap( quad, 1, 2, 3, 0, 1 ); // 1, 2, 3 == A, B, C respectively
}
Tested on Apple LLVM 4.2 compiler.
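The signature above puts the parameter pack in front of a and b. If you are free to reorder the parameters, a variant with the pack in the usual trailing position is sketched below. This is an alternative under that assumption, not the code tested above, and trap_args is a made-up name:

```cpp
#include <cassert>
#include <cmath>

// Same trapezoidal rule, but the extra arguments come last.
// F can be a function pointer or any other callable.
template<typename F, typename... Args>
double trap_args(F f, double a, double b, Args... args)
{
    const int N = 10000;
    const double step = (b - a) / N;
    double s = 0;
    for (int i = 0; i <= N; i++) {
        const double xi = a + i * step;
        if (i == 0 || i == N) { s += f(xi, args...); }
        else { s += 2 * f(xi, args...); }
    }
    return s * step / 2;
}

double foo(double X)
{
    return X;
}

double quad(double X, double A, double B, double C)
{
    return A * X * X + B * X + C;
}
```

A call then reads trap_args(quad, 0.0, 1.0, 1.0, 2.0, 3.0), with the bounds first and A, B, C after them.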

How to convert Biginteger to string

I have a vector holding the digits of a number. The vector represents a big integer in base 2^32. For example:
vector <unsigned> vec = {453860625, 469837947, 3503557200, 40};
This vector represent this big integer:
base = 2 ^ 32
3233755723588593872632005090577 = 40 * base ^ 3 + 3503557200 * base ^ 2 + 469837947 * base + 453860625
How to get this decimal representation in string?
Here is an inefficient way to do what you want, get a decimal string from a vector of word values representing an integer of arbitrary size.
I would have preferred to implement this as a class, for better encapsulation and so math operators could be added, but to better comply with the question, this is just a bunch of free functions for manipulating std::vector<unsigned> objects. This does use a typedef BiType as an alias for std::vector<unsigned> however.
Functions for doing the binary division make up most of this code. Much of it duplicates what can be done with std::bitset, but for bitsets of arbitrary size, as vectors of unsigned words. If you want to improve efficiency, plug in a division algorithm which does per-word operations, instead of per-bit. Also, the division code is general-purpose, when it is only ever used to divide by 10, so you could replace it with special-purpose division code.
The code generally assumes a vector of unsigned words and also that the base is the maximum unsigned value, plus one. I left a comment wherever things would go wrong for smaller bases or bases which are not a power of 2 (binary division requires base to be a power of 2).
Also, I only tested for 1 case, the one you gave in the OP -- and this is new, unverified code, so you might want to do some more testing. If you find a problem case, I'll be happy to fix the bug here.
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
namespace bigint {
using BiType = std::vector<unsigned>;
// cmp compares a with b, returning 1:a>b, 0:a==b, -1:a<b
int cmp(const BiType& a, const BiType& b) {
    const auto max_size = std::max(a.size(), b.size());
    // i+1 becomes 0 when i wraps around below zero, which ends the loop
    for(auto i=max_size-1; i+1; --i) {
        const auto wa = i < a.size() ? a[i] : 0;
        const auto wb = i < b.size() ? b[i] : 0;
        if(wa != wb) { return wa > wb ? 1 : -1; }
    }
    return 0;
}
bool is_zero(const BiType& bi) {
    for(auto w : bi) { if(w) return false; }
    return true;
}
// canonize removes leading zero words
void canonize(BiType& bi) {
    const auto size = bi.size();
    if(!size || bi[size-1]) return;
    for(auto i=size-2; i+1; --i) {
        if(bi[i]) {
            bi.resize(i + 1);
            return;
        }
    }
    bi.clear();
}
// subfrom subtracts b from a, modifying a
// a >= b must be guaranteed by caller
void subfrom(BiType& a, const BiType& b) {
    unsigned borrow = 0;
    for(std::size_t i=0; i<b.size(); ++i) {
        if(b[i] || borrow) {
            // TODO: handle error if i >= a.size()
            const auto w = a[i] - b[i] - borrow;
            // this relies on the automatic w = w (mod base),
            // assuming unsigned max is base-1
            // if this is not the case, w must be set to w % base here
            borrow = w >= a[i];
            a[i] = w;
        }
    }
    for(auto i=b.size(); borrow; ++i) {
        // TODO: handle error if i >= a.size()
        borrow = !a[i];
        --a[i];
        // a[i] must be set modulo base here too
        // (this is automatic when base is unsigned max + 1)
    }
}
// binary division and its helpers: these require base to be a power of 2
// hi_bit_set is base/2
// the definition assumes CHAR_BIT == 8
const auto hi_bit_set = unsigned(1) << (sizeof(unsigned) * 8 - 1);
// shift_right_1 divides bi by 2, truncating any fraction
void shift_right_1(BiType& bi) {
    unsigned carry = 0;
    for(auto i=bi.size()-1; i+1; --i) {
        const auto next_carry = (bi[i] & 1) ? hi_bit_set : 0;
        bi[i] >>= 1;
        bi[i] |= carry;
        carry = next_carry;
    }
    // if carry is nonzero here, 1/2 was truncated from the result
}
// shift_left_1 multiplies bi by 2
void shift_left_1(BiType& bi) {
    unsigned carry = 0;
    for(std::size_t i=0; i<bi.size(); ++i) {
        const unsigned next_carry = !!(bi[i] & hi_bit_set);
        bi[i] <<= 1; // assumes high bit is lost, i.e. base is unsigned max + 1
        bi[i] |= carry;
        carry = next_carry;
    }
    if(carry) { bi.push_back(1); }
}
// sets an indexed bit in bi, growing the vector when required
void set_bit_at(BiType& bi, std::size_t index, bool set=true) {
    std::size_t widx = index / (sizeof(unsigned) * 8);
    std::size_t bidx = index % (sizeof(unsigned) * 8);
    if(bi.size() < widx + 1) { bi.resize(widx + 1); }
    if(set) { bi[widx] |= unsigned(1) << bidx; }
    else { bi[widx] &= ~(unsigned(1) << bidx); }
}
// divide divides n by d, returning the result and leaving the remainder in n
// this is implemented using binary division
BiType divide(BiType& n, BiType d) {
    if(is_zero(d)) {
        // TODO: handle divide by zero
        return {};
    }
    std::size_t shift = 0;
    // scale d up by powers of 2 until it is no longer smaller than n
    while(cmp(n, d) == 1) {
        shift_left_1(d);
        ++shift;
    }
    BiType result;
    // walk d back down, subtracting wherever it still fits into n
    do {
        if(cmp(n, d) >= 0) {
            set_bit_at(result, shift);
            subfrom(n, d);
        }
        shift_right_1(d);
    } while(shift--);
    return result;
}
std::string get_decimal(BiType bi) {
    std::string dec_string;
    // repeat division by 10, using the remainder as a decimal digit
    // this will build a string with digits in reverse order, so
    // before returning, it will be reversed to correct this.
    do {
        const auto next_bi = divide(bi, {10});
        const char digit_value = static_cast<char>(bi.size() ? bi[0] : 0);
        dec_string.push_back('0' + digit_value);
        bi = next_bi;
    } while(!is_zero(bi));
    std::reverse(dec_string.begin(), dec_string.end());
    return dec_string;
}
} // namespace bigint
int main() {
    bigint::BiType my_big_int = {453860625, 469837947, 3503557200, 40};
    auto dec_string = bigint::get_decimal(my_big_int);
    std::cout << dec_string << '\n';
}
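The explanation above suggests replacing the bit-by-bit division with special-purpose code that works on whole words. A minimal sketch of that idea follows, assuming the words are exactly 32 bits wide (hence std::uint32_t instead of unsigned) and that a 64-bit intermediate is available. div_small and to_decimal are made-up names and not part of the code above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Divides the little-endian base-2^32 number in words by divisor (in place)
// and returns the remainder. divisor must be nonzero.
std::uint32_t div_small(std::vector<std::uint32_t>& words, std::uint32_t divisor)
{
    assert(divisor != 0);
    std::uint64_t rem = 0;
    // Schoolbook long division, most significant word first.
    for (std::size_t i = words.size(); i-- > 0;) {
        const std::uint64_t cur = (rem << 32) | words[i];
        words[i] = static_cast<std::uint32_t>(cur / divisor);
        rem = cur % divisor;
    }
    // Drop leading zero words so the caller can test for zero with empty().
    while (!words.empty() && words.back() == 0) words.pop_back();
    return static_cast<std::uint32_t>(rem);
}

// Builds the decimal string by repeatedly dividing by 10.
std::string to_decimal(std::vector<std::uint32_t> words)
{
    while (!words.empty() && words.back() == 0) words.pop_back();
    if (words.empty()) return "0";
    std::string digits;
    while (!words.empty()) {
        digits.push_back(static_cast<char>('0' + div_small(words, 10)));
    }
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

Each division by 10 is one pass over the words, so the whole conversion is still quadratic in the number of words, but it avoids the per-bit loop. Dividing by a larger power of 10 per pass would reduce the number of passes further.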

Do I need to check the pointer before using noalias in Eigen(c++)

Assume there are three matrices a, b and c.
a and c share the same buffer but have different names.
Should I do a check like this?
if(a.data() == c.data()){
    a = b * c;
}else{
    a.noalias() = b * c;
}
Or could I just write a = b * c?
Edit : full example
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;
template<typename Derived>
void sigmoid(MatrixBase<Derived> const &input, MatrixBase<Derived> const &weight,
             MatrixBase<Derived> &output)
{
    output = weight * input;
    output = 1.0 / (1.0 + (-1.0 * output.array()).exp());
}
int main()
{
    MatrixXd weight = MatrixXd::Random(2, 2);
    MatrixXd input = MatrixXd::Random(2, 2);
    MatrixXd activation;
    for(size_t i = 0; i != 2; ++i){
        MatrixBase<MatrixXd> const &Temp =
            i == 0 ? input : activation;
        sigmoid(Temp, weight, activation);
    }
}
The example is already simplified. The point is that when i == 0, Temp should be input, and otherwise it should be activation.
Your underlying assumption that the data pointers will match is incorrect. Just to prove that, try this:
Eigen::MatrixXd aa = Eigen::MatrixXd::Random(5,5);
Eigen::Map<Eigen::MatrixXd> gg(aa.data()+5, 4, 5);
std::cout << aa.data() << "\n";
std::cout << gg.data() << "\n";
So, you would have to either know at compile time if they share the same buffer or not (or think of a better test). With the limited example shown, I think you'll have to write a = b * c just to make sure a and c don't overlap.
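To see why overlap matters for a product in the first place, here is a sketch in plain C++ without Eigen. It uses a hand-written 2x2 product as a stand-in, and all names are made up for illustration. By default Eigen evaluates a matrix product into a temporary before assigning, and noalias() tells it to skip that temporary. The two functions below mimic those two behaviours:

```cpp
#include <array>
#include <cassert>

using Mat2 = std::array<double, 4>; // row-major 2x2 matrix

// Writes b * c straight into out with no temporary,
// which is only safe when out does not overlap b or c.
void mul_no_temp(const Mat2& b, const Mat2& c, Mat2& out)
{
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            out[i * 2 + j] = b[i * 2 + 0] * c[0 * 2 + j] + b[i * 2 + 1] * c[1 * 2 + j];
        }
    }
}

// Evaluates the product into a temporary first, then copies it to out,
// so out may be the same object as b or c.
void mul_with_temp(const Mat2& b, const Mat2& c, Mat2& out)
{
    Mat2 tmp{};
    mul_no_temp(b, c, tmp);
    out = tmp;
}
```

With b = {1, 2, 3, 4} and c = {5, 6, 7, 8}, calling mul_no_temp(b, c, c) overwrites entries of c that are still needed for the second row and yields {19, 22, 85, 98}, whereas mul_with_temp(b, c, c) yields the correct {19, 22, 43, 50}.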
