I am new to C++ and I am using the Eigen library. I was wondering if there is a way to sum certain elements of a vector. For example, say I have a vector that is 100 by 1 and I just want to sum the first 10 elements. Is there a way of doing that using the Eigen library?
What I am trying to do is this: say I have a vector that is 1000 by 1 and I want to take the mean of the first 10 elements, then the next 10 elements, and so on, and store the results in some vector. Hence I will have a vector of size 100 holding the averages. Any thoughts or suggestions are greatly appreciated.
Here are the first steps in my code. I have a vector S_temp4 that is 1000 by 1. I then initialize a new vector S_A that should hold the means. Here is my messy code so far (note that my question concerns the crudeMonteCarlo function):
#include <iostream>
#include <cmath>
#include <math.h>
#include <Eigen/Dense>
#include <Eigen/Geometry>
#include <random>
#include <time.h>
using namespace Eigen;
using namespace std;
void crudeMonteCarlo(int N,double K, double r, double S0, double sigma, double T, int n);
VectorXd time_vector(double min, double max, int n);
VectorXd call_payoff(VectorXd S, double K);
int main(){
    int N = 100;
    double K = 100;
    double r = 0.2;
    double S0 = 100;
    double sigma = 0.4;
    double T = 0.1;
    int n = 10;
    return 0;
}
VectorXd time_vector(double min, double max, int n){
    VectorXd m(n + 1);
    double delta = (max-min)/n;
    for(int i = 0; i <= n; i++){
        m(i) = min + i*delta;
    }
    return m;
}
MatrixXd generateGaussianNoise(int M, int N){
    MatrixXd Z(M,N);
    static random_device rd;
    static mt19937 e2(time(0));
    normal_distribution<double> dist(0.0, 1.0);
    for(int i = 0; i < M; i++){
        for(int j = 0; j < N; j++){
            Z(i,j) = dist(e2);
        }
    }
    return Z;
}
VectorXd call_payoff(VectorXd S, double K){
    VectorXd C(S.size());
    for(int i = 0; i < S.size(); i++){
        if(S(i) - K > 0){
            C(i) = S(i) - K;
        }else{
            C(i) = 0.0;
        }
    }
    return C;
}
void crudeMonteCarlo(int N,double K, double r, double S0, double sigma, double T, int n){
    // Create time vector
    VectorXd tt = time_vector(0.0,T,n);
    VectorXd t(n);
    double dt = T/n;
    for(int i = 0; i < n; i++){
        t(i) = tt(i+1);
    }
    // Generate standard normal Z matrix
    //MatrixXd Z = generateGaussianNoise(N,n);
    // Generate the log normal stock process N times to get S_A for crude Monte Carlo
    MatrixXd SS(N,n+1);
    MatrixXd Z = generateGaussianNoise(N,n);
    for(int i = 0; i < N; i++){
        SS(i,0) = S0;
        for(int j = 1; j <= n; j++){
            SS(i,j) = SS(i,j-1)*exp((double) (r - pow(sigma,2.0))*dt + sigma*sqrt(dt)*(double)Z(i,j-1));
        }
    }
    // This long bit of code gives me my S_A.....
    Map<RowVectorXd> S_temp1(SS.data(), SS.size());
    VectorXd S_temp2(S_temp1.size());
    for(int i = 0; i < S_temp2.size(); i++){
        S_temp2(i) = S_temp1(i);
    }
    VectorXd S_temp3(S_temp2.size() - N);
    int count = 0;
    for(int i = N; i < S_temp2.size(); i++){
        S_temp3(count) = S_temp2(i);
        count++; // advance the output index
    }
    VectorXd S_temp4(S_temp3.size());
    for(int i = 0; i < S_temp4.size(); i++){
        S_temp4(i) = S_temp3(i);
    }
    VectorXd S_A(N);
    S_A(0) = (S_temp4(0) + S_temp4(1) + S_temp4(2) + S_temp4(3) + S_temp4(4) + S_temp4(5) + S_temp4(6) + S_temp4(7) + S_temp4(8) + S_temp4(9))/(n);
    S_A(1) = (S_temp4(10) + S_temp4(11) + S_temp4(12) + S_temp4(13) + S_temp4(14) + S_temp4(15) + S_temp4(16) + S_temp4(17) + S_temp4(18) + S_temp4(19))/(n);
    int count1 = 0;
    for(int i = 0; i < S_temp4.size(); i++){
        S_A(count1) = // unfinished: this is the part my question is about
    }
    // Calculate payoff of Asian option
    //VectorXd call_fun = call_payoff(S_A,K);
}
This question includes a lot of code, which makes it hard to understand the question you're trying to ask. Consider including only the code specific to your question.
In any case, you can use Eigen directly to do all of these things quite simply. In Eigen, Vectors are just matrices with 1 column, so all of the reasoning here is directly applicable to what you've written.
const Eigen::Matrix<double, 100, 1> v = Eigen::Matrix<double, 100, 1>::Random();
const int num_rows = 10;
const int num_cols = 1;
const int starting_row = 0;
const int starting_col = 0;
const double sum_of_first_ten = v.block(starting_row, starting_col, num_rows, num_cols).sum();
const double mean_of_first_ten = sum_of_first_ten / num_rows;
In summary: You can use .block to get a block object, .sum() to sum that block, and then conventional division to get the mean.
You can reshape the input using Map and then do all sub-summations at once without any loop:
VectorXd A(1000); // input
Map<MatrixXd> B(A.data(), 10, A.size()/10); // reshaped version, no copy
VectorXd res = B.colwise().mean(); // partial reduction, you can also use .sum(), .minCoeff(), etc.
The Eigen documentation at https://eigen.tuxfamily.org/dox/group__TutorialBlockOperations.html describes a block as a rectangular part of a matrix or array, accessed with matrix.block(i,j,p,q), where i and j are the starting row and column (e.g. 0 and 0) and p and q are the block size in rows and columns (e.g. 10 and 1). Presumably you would then iterate i in steps of 10 and find the mean of matrix.block(i,0,10,1), using the block's own .sum() or .mean(), std::accumulate, or an explicit summation.
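The "step through the data 10 at a time" idea can be sketched without Eigen, on a plain std::vector with std::accumulate. This is only an illustration: chunk_means is a hypothetical helper name, not an Eigen function, and it assumes a trailing partial chunk should simply be ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: mean of each consecutive run of `chunk` elements of v.
// Assumption: a trailing partial chunk (if any) is ignored.
std::vector<double> chunk_means(const std::vector<double>& v, std::size_t chunk) {
    assert(chunk > 0);
    std::vector<double> means;
    means.reserve(v.size() / chunk);
    for (std::size_t start = 0; start + chunk <= v.size(); start += chunk) {
        const auto first = v.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = first + static_cast<std::ptrdiff_t>(chunk);
        // Sum one chunk, starting from 0.0 so the accumulation is done in double.
        const double sum = std::accumulate(first, last, 0.0);
        means.push_back(sum / static_cast<double>(chunk));
    }
    return means;
}
```

For a 1000-element input and chunk = 10 this returns 100 means, which is the shape asked for in the question.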
This question might be long and I really appreciate your patience. The core problem is that I implemented an optimization algorithm in both MATLAB and C++, but they give me different results (MATLAB's are better).
I have recently been studying some evolutionary algorithms and am interested in one variant of PSO (Particle Swarm Optimization) called Competitive Swarm Optimizer (published in 2015). This is the paper link: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6819057.
The basic idea of this algorithm is to first generate some random particles in the search space and assign them random velocities. At each iteration, we randomly pair them and let every pair of particles compare their objective function values. Winners (with better objective values) keep the status quo, while losers update themselves by learning from winners (moving toward winners).
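Suppose at iteration t, particles i and j are compared and i is better. Then we update particle j for iteration t+1 with the following formulas (written here to match the code below):
$$v_j(t+1) = R_1 \otimes v_j(t) + R_2 \otimes \big(x_i(t) - x_j(t)\big) + \phi\, R_3 \otimes \big(\bar{x}(t) - x_j(t)\big)$$
$$x_j(t+1) = x_j(t) + v_j(t+1)$$
If particle j leaves the search space, we simply pull it back to the boundary. R_1, R_2, R_3 are random vectors drawn uniformly from [0, 1]; the operation 'otimes' is the elementwise product; phi is a parameter; x_bar is the center of the swarm.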
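The update described above can be sketched for a single loser as follows. This is a minimal illustration, not the paper's reference code: update_loser and swarm_center are hypothetical names, the bounds are assumed to be the same scalar pair for every dimension, and fresh random numbers are drawn per dimension.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: center of the swarm (mean position over all particles).
// The accumulator is created from zero on every call.
std::vector<double> swarm_center(const std::vector<std::vector<double>>& swarm) {
    assert(!swarm.empty());
    std::vector<double> center(swarm.front().size(), 0.0);
    for (const auto& particle : swarm)
        for (std::size_t d = 0; d < center.size(); ++d)
            center[d] += particle[d];
    for (double& c : center)
        c /= static_cast<double>(swarm.size());
    return center;
}

// Hypothetical helper: update the loser's velocity and position in place.
void update_loser(std::vector<double>& x_loser, std::vector<double>& v_loser,
                  const std::vector<double>& x_winner,
                  const std::vector<double>& x_mean,
                  double phi, double lower, double upper, std::mt19937& rng) {
    assert(x_loser.size() == v_loser.size());
    assert(x_loser.size() == x_winner.size());
    assert(x_loser.size() == x_mean.size());
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    for (std::size_t d = 0; d < x_loser.size(); ++d) {
        const double r1 = unif(rng);
        const double r2 = unif(rng);
        const double r3 = unif(rng);
        // Velocity: damped old velocity + pull toward winner + pull toward center.
        v_loser[d] = r1 * v_loser[d]
                   + r2 * (x_winner[d] - x_loser[d])
                   + phi * r3 * (x_mean[d] - x_loser[d]);
        // Position: move, then clamp back onto the boundary if outside.
        x_loser[d] = std::min(std::max(x_loser[d] + v_loser[d], lower), upper);
    }
}
```

Using std::mt19937 instead of rand() is a design choice of this sketch; the formulas do not depend on it.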
For example, suppose I want to minimize a 500-dimensional Schwefel function (minimize the maximal absolute element) using 250 particles, phi = 0.1, and the search space [-100, 100] in each of the 500 dimensions. MATLAB returns something around 35, while the C++ version gets stuck at 85 to 90. I cannot figure out what the problem is.
Let me attach my matlab and c++ code here.
Sch = @(x)max(abs(x))
lb = -100 * ones(1, 500);
ub = 100 * ones(1, 500);
swarmsize = 250;
phi = 0.1;
maxiter = 10000;
cso(Sch, lb, ub, swarmsize, phi, maxiter);
function [minf, minx] = cso(obj_fun, lb, ub, swarmsize, phi, maxiter)
assert(length(lb) == length(ub), 'Not equal length of bounds');
if all(ub - lb <= 0) > 0
    error('Error. \n Upper bound must be greater than lower bound.')
end
vhigh = abs(ub - lb);
vlow = -vhigh;
S = swarmsize;
D = length(ub);
x = rand(S, D);
x = bsxfun(@plus, lb, bsxfun(@times, ub-lb, x)); % randomly initialize all particles
v = zeros([S D]); % set initial velocities to 0
iter = 0;
pairnum_1 = floor(S / 2);
losers = 1:S;
fx = arrayfun(@(K) obj_fun(x(K, :)), 1:S);
randperm_index = randperm(S);
while iter <= maxiter
    fx(losers) = arrayfun(@(K) obj_fun(x(K, :)), losers);
    swarm_center = mean(x); % calculate center of all particles
    randperm_index = randperm(S); % randomly permute all particle indexes
    rpairs = [randperm_index(1:pairnum_1); randperm_index(S-pairnum_1+1:S)]'; % random pairs
    cmask = (fx(rpairs(:, 1)) > fx(rpairs(:, 2)))';
    losers = bsxfun(@times, cmask, rpairs(:, 1)) + bsxfun(@times, ~cmask, rpairs(:, 2)); % losers: the ones with larger values
    winners = bsxfun(@times, ~cmask, rpairs(:, 1)) + bsxfun(@times, cmask, rpairs(:, 2)); % winners: the ones with smaller values
    R1 = rand(pairnum_1, D);
    R2 = rand(pairnum_1, D);
    R3 = rand(pairnum_1, D);
    v(losers, :) = bsxfun(@times, R1, v(losers, :)) + bsxfun(@times, R2, x(winners, :) - x(losers, :)) + phi * bsxfun(@times, R3, bsxfun(@minus, swarm_center, x(losers, :)));
    x(losers, :) = x(losers, :) + v(losers, :);
    maskl = bsxfun(@lt, x(losers, :), lb);
    masku = bsxfun(@gt, x(losers, :), ub);
    mask = bsxfun(@lt, x(losers, :), lb) | bsxfun(@gt, x(losers, :), ub);
    x(losers, :) = bsxfun(@times, ~mask, x(losers, :)) + bsxfun(@times, lb, maskl) + bsxfun(@times, ub, masku);
    iter = iter + 1;
    fprintf('Iter: %d\n', iter);
    fprintf('Best fitness: %e\n', min(fx));
end
fprintf('Best fitness: %e\n', min(fx));
[minf, min_index] = min(fx);
minx = x(min_index, :);
(I did not wrap the C++ version in a separate function.)
#include <cstring>
#include <iostream>
#include <cmath>
#include <algorithm>
#include <ctime>
#include <iomanip>
#include <time.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#define rand_01 ((double) rand() / RAND_MAX) // generate 0~1 random numbers
#define PI 3.14159265359
const int numofdims = 500; // problem dimension
const int numofparticles = 250; // number of particles
const int halfswarm = numofparticles / 2;
const double phi = 0.1;
const int maxiter = 10000; // iteration number
double Sch(double X[], int d); // max(abs(x_i))
using namespace std;
int main(){
    clock_t t1,t2;
    t1 = clock(); // start timing
    srand(time(0)); // random seed
    double** X = new double*[numofparticles]; // X for storing all particles
    for(int i=0; i<numofparticles; i++)
        X[i] = new double[numofdims];
    double** V = new double*[numofparticles]; // V for storing velocities
    for(int i=0; i<numofparticles; i++)
        V[i] = new double[numofdims];
    double Xmin[numofdims] = {0}; // lower bounds
    double Xmax[numofdims] = {0}; // upper bounds
    double* fitnesses = new double[numofparticles]; // objective function values
    for(int j=0; j<numofdims; j++){
        Xmin[j] = -100;
        Xmax[j] = 100;
    }
    for(int i=0; i<numofparticles; i++){
        for(int j=0; j<numofdims; j++){
            X[i][j] = Xmin[j] + rand_01 * (Xmax[j] - Xmin[j]); // initialize X
            V[i][j] = 0; // initialize V
        }
    }
    for(int i=0; i<numofparticles; i++)
        fitnesses[i] = Sch(X[i], numofdims); // initial objective values
    double minfit = fitnesses[0]; // temporary minimal value
    int minidx = 0; // temporary index of minimal value
    int* idxofparticles = new int[numofparticles];
    for(int i=0; i<numofparticles; i++)
        idxofparticles[i] = i;
    double* Xmean = new double[numofdims];
    int* losers = new int[halfswarm]; // for saving losers indexes
    for(int iter=0; iter<maxiter; iter++){
        random_shuffle(idxofparticles, idxofparticles+numofparticles);
        for(int j=0; j<numofdims; j++){
            for(int i=0; i<numofparticles; i++)
                Xmean[j] += X[i][j];
            Xmean[j] = (double) Xmean[j] / numofparticles; // calculate swarm center
        }
        for(int i = 0; i < halfswarm; i++){
            // indexes are now random
            // compare 1st to (halfswarm+1)th, 2nd to (halfswarm+2)th, ...
            if(fitnesses[idxofparticles[i]] < fitnesses[idxofparticles[i+halfswarm]]){
                losers[i] = idxofparticles[i+halfswarm];
                for(int j = 0; j < numofdims; j++){
                    V[idxofparticles[i+halfswarm]][j] = rand_01 * V[idxofparticles[i+halfswarm]][j] + rand_01 * (X[idxofparticles[i]][j] - X[idxofparticles[i+halfswarm]][j]) + rand_01 * phi * (Xmean[j] - X[idxofparticles[i+halfswarm]][j]);
                    X[idxofparticles[i+halfswarm]][j] = min(max((X[idxofparticles[i+halfswarm]][j] + V[idxofparticles[i+halfswarm]][j]), Xmin[j]), Xmax[j]);
                }
            }else{
                losers[i] = idxofparticles[i];
                for(int j = 0; j < numofdims; j++){
                    V[idxofparticles[i]][j] = rand_01 * V[idxofparticles[i]][j] + rand_01 * (X[idxofparticles[i+halfswarm]][j] - X[idxofparticles[i]][j]) + rand_01 * phi * (Xmean[j] - X[idxofparticles[i]][j]);
                    X[idxofparticles[i]][j] = min(max((X[idxofparticles[i]][j] + V[idxofparticles[i]][j]), Xmin[j]), Xmax[j]);
                }
            }
        }
        // recalculate particles' values
        for(int i=0; i<numofparticles; i++){
            fitnesses[i] = Sch(X[i], numofdims);
            if(fitnesses[i] < minfit){
                minfit = fitnesses[i]; // update minimum
                minidx = i; // update index
            }
        }
        if(iter % 1000 == 0){
            cout << scientific << endl;
            cout << minfit << endl;
        }
    }
    cout << scientific << endl;
    cout << minfit << endl;
    t2 = clock(); // stop timing
    delete [] X;
    delete [] V;
    delete [] fitnesses;
    delete [] idxofparticles;
    delete [] Xmean;
    delete [] losers;
    float diff ((float)t2-(float)t1);
    float seconds = diff / CLOCKS_PER_SEC;
    cout << "runtime: " << seconds << "s" <<endl;
    return 0;
}
double Sch(double X[], int d){
    double result=abs(X[0]);
    for(int j=0; j<d; j++){
        if(abs(X[j]) > result)
            result = abs(X[j]);
    }
    return result;
}
So, finally, why can't my c++ code reproduce matlab's outcome? Thank you very much.
This is the original MATLAB implementation
function[m, p] = max2(im)
[m1, k1] = max(im);
[m, k2] = max(m1);
x = k2;
y = k1(k2);
p = [y, x];
It is being used inside this functionality
for r = 2.^linspace(log2(minR),log2(maxR),numSteps);
    itestSeek = imresize(itestBase,minR/r);
    icorr = normxcorr2(cc,itestSeek);
    [m,p] = max2(icorr); % here
    if (m>bestm)
        bestp = p*r;
        bests = ccSize*r;
        bestm = m;
    end
end
Here is my OpenCV 3.0.0/ c++ implementation
void Utilities::Max2(cv::Mat input_image, double& m, std::vector<int>& p){
    std::vector<double> m1(input_image.cols); // the local maximum for each column
    std::vector<int> k1(input_image.cols); // the row index of that local maximum
    for (int c = 0; c < input_image.cols; ++c){
        float temp_max = input_image.at<float>(0, c);
        int temp_index = 0;
        for (int r = 0; r < input_image.rows; ++r){
            if (temp_max < input_image.at<float>(r, c)){
                temp_max = input_image.at<float>(r, c);
                temp_index = r;
            }
        }
        m1[c] = temp_max;
        k1[c] = temp_index;
    }
    auto iter = std::max_element(m1.begin(), m1.end()); // max of all the local maxima
    m = *iter;
    int k2 = std::distance(m1.begin(), iter);
    double y = k1[k2];
    double x = k2;
    p.push_back(y); // row first, mirroring p = [y, x] in the MATLAB version
    p.push_back(x); // then column
}
c++ usage of the function
std::vector<double> best_p;
std::vector<double> best_s;
for (std::size_t i = 0; i < linspace_vector.size(); i++){
    cv::Mat i_test_seek;
    cv::Mat i_corr;
    double r = linspace_vector[i];
    double resize_factor = min_r / r; // minR/r in matlab
    cv::resize(i_test_base, i_test_seek, cv::Size(), resize_factor, resize_factor, cv::INTER_CUBIC);
    cv::matchTemplate(i_test_seek, cc_template, i_corr, CV_TM_CCORR_NORMED);
    cv::imshow("i_corr", i_corr);
    double m;
    std::vector<int> p;
    Utilities::Max2(i_corr, m, p);
    if (m> best_m){
        for (int i = 0; i < p.size(); ++i)
            best_p.push_back(p[i] * r);
        best_s.push_back(cc_size_height * r);
        best_s.push_back(cc_size_width * r);
        best_m = m;
    }
}
Can you suggest a more efficient way of doing this?
I find the local maximum of each column and the row index of that value.
Then I find the global maximum among those column maxima.
Can you try the following and benchmark whether the performance improves?
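#include <limits>
#include <utility>
void Utilities::Max2(cv::Mat input_image, double& m, std::vector<int>& p){
    // lowest() is the most negative double; min() would be the smallest positive value
    m = std::numeric_limits<double>::lowest();
    std::pair<int, int> temp_index(0, 0); // (column, row) of the current maximum
    for (int r = 0; r < input_image.rows; ++r){
        for (int c = 0; c < input_image.cols; ++c){
            if (m < input_image.at<float>(r, c)){
                m = input_image.at<float>(r, c);
                temp_index = std::make_pair(c, r);
            }
        }
    }
    // assign rather than index, so this also works when p arrives empty
    p = {temp_index.second, temp_index.first}; // {row, column}
}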
If there is a way to get the input as a flat vector and you know the number of columns, for example using:
int cols = input_image.cols;
std::vector<double> v;
v.assign(input_image.begin<float>(), input_image.end<float>()); // assumes a single-channel CV_32F image, as produced by cv::matchTemplate
Then you can compute in just one go:
std::vector<double>::iterator iter = std::max_element(v.begin(), v.end());
double m = *iter;
int k = std::distance(v.begin(), iter);
int y = (int)k / cols;
int x = k % cols;
However, I am not sure whether getting the data as a vector is an option, nor how expensive the conversion to a vector is. Maybe you can run it and see how it compares to your implementation.
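The index arithmetic in that approach can be checked in isolation, without OpenCV. The following is a small sketch on a plain std::vector, assuming the image is stored row by row (row-major); argmax_2d is a hypothetical name used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper: (row, col) of the largest element of an image that is
// stored row by row in `data`, with `cols` elements per row.
std::pair<std::size_t, std::size_t> argmax_2d(const std::vector<float>& data,
                                              std::size_t cols) {
    assert(cols > 0 && !data.empty() && data.size() % cols == 0);
    const auto it = std::max_element(data.begin(), data.end());
    const std::size_t k =
        static_cast<std::size_t>(std::distance(data.begin(), it));
    // Row-major layout: the quotient is the row, the remainder is the column.
    return {k / cols, k % cols};
}
```

For example, in a 2 x 3 image whose largest value sits at flat index 4, the result is row 1, column 1.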
The first piece of code is essentially finding the max value and its indices (both x and y) in an image to my understanding.
function[m, p] = max2(im)
[m1, k1] = max(im); %find the max value in each col
[m, k2] = max(m1); %find the max value among maxes
x = k2; %column index of the max value
y = k1(k2); %and its row index
p = [y, x];
This can be done with explicit loops, but a hand-written loop is usually slower than vectorized operations or OpenCV's built-in functions.
So, if my understanding is correct, this operation can simply be done by
double minVal, maxVal;
Point minLoc, maxLoc;
minMaxLoc(im, &minVal, &maxVal, &minLoc, &maxLoc);
maxLoc.y will give the row, and maxLoc.x will give col.
update: Your Matlab code can also be simplified (which potentially will speed up too)
[mx, ind] = max(im(:));
[y, x] = ind2sub(size(im), ind); % rem(ind, size(im,1)) would return 0 for the last row
p = [y, x];
You could also try the following:
// creating a random matrix with 2 rows and 4 columns
Mat1d mat(2, 4);
double low = -7000.0; // minimum value for generating random numbers
double high = +7000.0; // maximum value for generating random numbers
randu(mat, Scalar(low), Scalar(high)); // generating random number matrix
double max_element = *std::max_element(mat.begin(),mat.end()); // get the max element in the matrix
int max_element_index = std::max_element(mat.begin(),mat.end()) - mat.begin(); // get the max_element_index from the matrix
The max element index is in row-major order, running from 0 to the number of elements in the matrix minus one (7 in this case):
cout << mat << endl;
cout << max_element << endl;
cout << max_element_index << endl;
[Adapted from the question "Generate random numbers matrix in OpenCV" for the code above.]
I am trying to compute the Anderson-Darling test found here. I followed the steps on Wikipedia and used MATLAB to check that I compute the mean and standard deviation of the data under test (denoted X) correctly. I also use a function called phi for the standard normal CDF, and I have tested that it is correct. The problem appears when I actually compute A-squared (as it is denoted on Wikipedia; I call it A in C++).
Here is my function I made for Anderson-Darling Test:
void Anderson_Darling(int n, double X[]){
    sort(X,X + n);
    // Find the mean of X
    double X_avg = 0.0;
    double sum = 0.0;
    for(int i = 0; i < n; i++){
        sum += X[i];
    }
    X_avg = ((double)sum)/n;
    // Find the variance of X
    double X_sig = 0.0;
    for(int i = 0; i < n; i++){
        X_sig += (X[i] - X_avg)*(X[i] - X_avg);
    }
    X_sig /= n;
    // The values X_i are standardized to create new values Y_i
    double Y[n];
    for(int i = 0; i < n; i++){
        Y[i] = (X[i] - X_avg)/(sqrt(X_sig));
        //cout << Y[i] << endl;
    }
    // With a standard normal CDF, we calculate the Anderson_Darling Statistic
    double A = 0.0;
    for(int i = 0; i < n; i++){
        A += -n - 1/n *(2*(i) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n+1 - i])));
    }
    cout << A << endl;
}
Note, I know that the formula for Anderson-Darling (A-squared) starts with i = 1 to i = n, although when I changed the index to make it work in C++, I still get the same result without changing the index.
The value I get in C++ is:
The value I should get, received in MATLAB is:
Any suggestions are greatly appreciated.
Here is my whole code:
#include <iostream>
#include <math.h>
#include <cmath>
#include <random>
#include <algorithm>
#include <chrono>
using namespace std;
double *Box_Muller(int n, double u[]);
double *Beasley_Springer_Moro(int n, double u[]);
void Anderson_Darling(int n, double X[]);
double phi(double x);
int main(){
    int n = 2000;
    double Mersenne[n];
    random_device rd;
    mt19937 e2(1);
    uniform_real_distribution<double> dist(0, 1);
    for(int i = 0; i < n; i++){
        Mersenne[i] = dist(e2);
    }
    // Print Anderson Statistic for Mersenne 6a
    double *result = new double[n];
    result = Box_Muller(n,Mersenne);
    return 0;
}
double *Box_Muller(int n, double u[]){
    double *X = new double[n];
    double Y[n];
    double R_2[n];
    double theta[n];
    for(int i = 0; i < n; i++){
        R_2[i] = -2.0*log(u[i]);
        theta[i] = 2.0*M_PI*u[i+1];
    }
    for(int i = 0; i < n; i++){
        X[i] = sqrt(-2.0*log(u[i]))*cos(2.0*M_PI*u[i+1]);
        Y[i] = sqrt(-2.0*log(u[i]))*sin(2.0*M_PI*u[i+1]);
    }
    return X;
}
double *Beasley_Springer_Moro(int n, double u[]){
    double y[n];
    double r[n+1];
    double *x = new double[n + 1]; // array of n+1 doubles; new double(n) would allocate a single double
    // Constants needed for algo
    double a_0 = 2.50662823884; double b_0 = -8.47351093090;
    double a_1 = -18.61500062529; double b_1 = 23.08336743743;
    double a_2 = 41.39119773534; double b_2 = -21.06224101826;
    double a_3 = -25.44106049637; double b_3 = 3.13082909833;
    double c_0 = 0.3374754822726147; double c_5 = 0.0003951896511919;
    double c_1 = 0.9761690190917186; double c_6 = 0.0000321767881768;
    double c_2 = 0.1607979714918209; double c_7 = 0.0000002888167364;
    double c_3 = 0.0276438810333863; double c_8 = 0.0000003960315187;
    double c_4 = 0.0038405729373609;
    // Set r and x to empty for now
    for(int i = 0; i <= n; i++){
        r[i] = 0.0;
        x[i] = 0.0;
    }
    for(int i = 1; i <= n; i++){
        y[i] = u[i] - 0.5;
        if(fabs(y[i]) < 0.42){
            r[i] = pow(y[i],2.0);
            x[i] = y[i]*(((a_3*r[i] + a_2)*r[i] + a_1)*r[i] + a_0)/((((b_3*r[i] + b_2)*r[i] + b_1)*r[i] + b_0)*r[i] + 1);
        }else{
            r[i] = u[i];
            if(y[i] > 0.0){
                r[i] = 1.0 - u[i];
            }
            r[i] = log(-log(r[i]));
            x[i] = c_0 + r[i]*(c_1 + r[i]*(c_2 + r[i]*(c_3 + r[i]*(c_4 + r[i]*(c_5 + r[i]*(c_6 + r[i]*(c_7 + r[i]*c_8)))))));
            if(y[i] < 0){
                x[i] = -x[i];
            }
        }
    }
    return x;
}
double phi(double x){
    return 0.5 * erfc(-x * M_SQRT1_2);
}
void Anderson_Darling(int n, double X[]){
    sort(X,X + n);
    // Find the mean of X
    double X_avg = 0.0;
    double sum = 0.0;
    for(int i = 0; i < n; i++){
        sum += X[i];
    }
    X_avg = ((double)sum)/n;
    // Find the variance of X
    double X_sig = 0.0;
    for(int i = 0; i < n; i++){
        X_sig += (X[i] - X_avg)*(X[i] - X_avg);
    }
    X_sig /= (n-1);
    // The values X_i are standardized to create new values Y_i
    double Y[n];
    for(int i = 0; i < n; i++){
        Y[i] = (X[i] - X_avg)/(sqrt(X_sig));
        //cout << Y[i] << endl;
    }
    // With a standard normal CDF, we calculate the Anderson_Darling Statistic
    double A = -n;
    for(int i = 0; i < n; i++){
        A += -1.0/(double)n *(2*(i+1) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n - i])));
    }
    cout << A << endl;
}
Let me guess, your n was 2000. Right?
The major issue is the 1/n in the last expression. 1 is an int and so is n, so dividing 1 by n performs integer division. Under integer division, 1 divided by any number greater than 1 is 0 (think of it as keeping only the integer part of the quotient). What you need to do is cast n to double by writing 1/(double)n.
The rest should work fine.
Summary from discussions -
Indexes to Y[] should be i and n-1-i respectively.
n should not be added in the loop but only once.
Minor fixes like changing divisor to n instead of n-1 while calculating Variance.
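Putting the points from this summary together, the statistic could be computed roughly as below. This is a sketch under stated assumptions, not the asker's final code: normal_cdf and anderson_darling are hypothetical names, the input is taken as a std::vector, and the variance divisor is n - 1 (as in the asker's full listing), which is a choice rather than something the summary settles.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal CDF, written with erfc like the phi() in the question.
double normal_cdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Hypothetical helper: Anderson-Darling A^2 against a normal distribution whose
// mean and standard deviation are estimated from the sample.
double anderson_darling(std::vector<double> x) { // by value: sorted locally
    const std::size_t n = x.size();
    assert(n >= 2);
    std::sort(x.begin(), x.end());

    double mean = 0.0;
    for (double xi : x) mean += xi;
    mean /= static_cast<double>(n);

    double var = 0.0;
    for (double xi : x) var += (xi - mean) * (xi - mean);
    var /= static_cast<double>(n - 1); // assumption: sample variance
    const double sd = std::sqrt(var);

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double y_lo = (x[i] - mean) / sd;         // Y[i]
        const double y_hi = (x[n - 1 - i] - mean) / sd; // Y[n-1-i]
        // The 1-based weight (2i - 1) becomes 2*(i+1) - 1 with 0-based i.
        sum += (2.0 * static_cast<double>(i + 1) - 1.0)
             * (std::log(normal_cdf(y_lo)) + std::log(1.0 - normal_cdf(y_hi)));
    }
    // -n is added once, outside the loop; the division is done in double.
    return -static_cast<double>(n) - sum / static_cast<double>(n);
}
```

One quick sanity check is that negating every data point leaves the result unchanged, because the formula pairs each Y[i] with Y[n-1-i].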
You have integer division here:
A += -n - 1/n *(2*(i) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n+1 - i])));
1/n is zero when n > 1 - you need to change this to, e.g.: 1.0/n:
A += -n - 1.0/n *(2*(i) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n+1 - i])));
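The difference between the two spellings can be seen in isolation with a tiny sketch (weight_int and weight_real are hypothetical names used only for this illustration):

```cpp
#include <cassert>

// 1 and n are both int, so the division is done in integers and the
// (already truncated) result is converted to double afterwards.
double weight_int(int n) {
    assert(n != 0);
    return 1 / n;
}

// 1.0 is a double, so n is converted to double and the division is exact
// up to floating-point rounding.
double weight_real(int n) {
    assert(n != 0);
    return 1.0 / n;
}
```

With n = 2000 the first function returns 0 and the second returns 0.0005, which is why the whole second term vanished in the original expression.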
I am a graduate student at Florida State University studying financial mathematics. I am still a bit of a novice with C++, but I am trying to implement the Longstaff-Schwartz method for pricing American options. Because the algorithm in the journal article is a bit daunting, I am converting code that was written in MATLAB into C++, essentially using the MATLAB code as a guide.
I was referred by some Stack Exchange users to the Eigen library, which contains a good matrix class. Unfortunately, the website does not show me how to write my own function using the class. What I am stuck on is writing a C++ function that does what this MATLAB expression does:
Say t = 0:1/2:1 then in Matlab the output will be t = 0 0.500 1
So, using the Eigen class, I created a function called range to achieve this. The function looks like this:
MatrixXd range(double min, double max, double N){
    MatrixXd m(N,1);
    double delta = (max-min)/N;
    for(int i = 0; i < N; i++){
        for(int j = 0; j < N; j++){
            m(i,j) = min + i*delta;
        }
    }
    return m;
}
My IDE (Eclipse) does not show any errors, but when I run my code and test this function I get this error message:
error: static assertion failed:
I am not sure what is wrong. Any suggestions on achieving what I am trying to do or any suggestions at all are greatly appreciated.
Taking the suggestion by Martijn Courteaux, I changed N into an int, but now I receive a new error that I do not understand:
c:\mingw\include\c++\6.2.0\eigen\src/Core/Matrix.h:350:7: error: static
For sake of completeness I will post my whole code below:
#include <iostream>
#include <cmath>
#include <limits>
#include <algorithm>
#include <Eigen/Dense>
#include <Eigen/Geometry>
using namespace Eigen;
using namespace std;
double LaguerreExplicit(int R, double x); // Generates the (weighted) laguerre value
double payoff_Call(double S, double K); // Pay off of a call option
double generateGaussianNoise(double mu, double sigma); // Generates Normally distributed random numbers
double LSM(int T, double r, double sigma, double K, double S0, int N, int M, int R);
// T Expiration time
// r Riskless interest rate
// sigma Volatility
// K Strike price
// S0 Initial asset price
// N Number of time steps
// M Number of paths
// R Number of basis functions
MatrixXd range(double min, double max, int N);
int main(){
    MatrixXd range(0, 1, 2);
}
double payoff_Call(double S, double K){
    double payoff;
    if((S - K) > 0)
        payoff = S - K;
    else
        payoff = 0.0;
    return payoff;
}
double LaguerreExplicit(int R, double x){
    double value;
    if(R==0)
        value = 1;
    else if(R==1)
        value = -1*x + 1;
    else if(R==2)
        value = 0.5*(pow(x,2) - 4.0*x + 2);
    else if(R==3)
        value = (1.0/6.0)*(-1*pow(x,3) + 9*pow(x,2) - 18*x + 6);
    else if(R==4)
        value = (1.0/24.0)*(pow(x,4) - 16*pow(x,3) + 72*pow(x,2) - 96*x + 24);
    else if(R==5)
        value = (1.0/120.0)*(-1*pow(x,5) + 25*pow(x,4) - 200*pow(x,3) + 600*pow(x,2) - 600*x + 120);
    else if (R==6)
        value = (1.0/720.0)*(pow(x,6) - 36*pow(x,5) + 450*pow(x,4) - 2400*pow(x,3) + 5400*pow(x,2) - 4320*x + 720);
    else{
        cout << "Error!, R is out of range" << endl;
        value = 0;
    }
    value = exp(-0.5*x)*value; // Weighting used in Longstaff-Schwartz
    return value;
}
double generateGaussianNoise(double mu, double sigma){
    const double epsilon = std::numeric_limits<double>::min();
    const double two_pi = 2.0*M_PI;
    static double z0, z1;
    static bool generate;
    generate = !generate;
    if (!generate)
        return z1 * sigma + mu;
    double u1, u2;
    do{
        u1 = rand() * (1.0 / RAND_MAX);
        u2 = rand() * (1.0 / RAND_MAX);
    }
    while ( u1 <= epsilon );
    z0 = sqrt(-2.0 * log(u1)) * cos(two_pi * u2);
    z1 = sqrt(-2.0 * log(u1)) * sin(two_pi * u2);
    return z0 * sigma + mu;
}
MatrixXd range(double min, double max, int N){
    MatrixXd m(N,1);
    double delta = (max-min)/N;
    for(int i = 0; i < N; i++){
        for(int j = 0; j < N; j++){
            m(i,j) = min + i*delta;
        }
    }
    return m;
}
double LSM(int T, double r, double sigma, double K, double S0, int N, int M, int R){
    double dt = T/N;
    MatrixXd m(T,1);
    return 0;
}
Here is the corrected function code that I fixed:
VectorXd range(double min, double max, int N){
    VectorXd m(N + 1);
    double delta = (max-min)/N;
    for(int i = 0; i <= N; i++){
        m(i) = min + i*delta;
    }
    return m;
}
Mistake is here:
MatrixXd range(double min, double max, double N){
MatrixXd m(N,1);
N is a double. The arguments of MatrixXd::MatrixXd(int, int) are int.
You presumably want to make N an int.
In regard to your edit:
Second mistake is here:
MatrixXd range(0, 1, 2);
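in the main() function. Not sure what you are trying to do here, but that constructor is not valid. EDIT: Ah, I believe I have an idea: you are trying to call your function named range. As written, the line instead declares a variable named range and passes three arguments to the MatrixXd constructor. Call the function like this: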
MatrixXd result = range(0.0, 1.0, 2);
I just learned that there's a way to achieve some parallelization using intrinsics. I found the following code and wanted to go through it, but I couldn't understand much. I was trying to make the operations run in single precision, but how can I do that?
#include <stdio.h>
#include <stdlib.h>
#include <xmmintrin.h>
inline double pi_4 (int n){
    int i;
    __m128d mypart2,x2, b, c, one;
    double *x = (double *)malloc(n*sizeof(double));
    double *mypart = (double *)malloc(n*sizeof(double));
    double sum = 0.0;
    double dx = 1.0/n;
    double x1[2] __attribute__((aligned(16)));
    one = _mm_set_pd1(1.0); // set one to (1,1)
    for (i = 0; i < n; i++){
        x[i] = dx/2 + dx*i;
    }
    for (i = 0; i < n; i+=2){
        x1[0]=x[i]; x1[1]=x[i+1];
        x2 = _mm_load_pd(x1);
        b = _mm_mul_pd(x2,x2);
        c = _mm_add_pd(b,one);
        mypart2 = _mm_div_pd(one,c);
        _mm_store_pd(&mypart[i], mypart2);
    }
    for (i = 0; i < n; i++)
        sum += mypart[i];
    return sum*dx;
}
int main(){
    double res;
    printf("pi = %lf\n", 4*res);
    return 0;
}
I was thinking of changing everything from double to float and calling the corresponding intrinsics, for instance _mm_set_ps1 instead of _mm_set_pd1. I don't know whether that is enough to turn the program from double into single precision.
I tried as follows, but I'm getting a segmentation fault:
#include <stdio.h>
#include <stdlib.h>
#include <xmmintrin.h>
inline float pi_4 (int n){
    int i;
    __m128 mypart2,x2, b, c, one;
    float *x = (float *)malloc(n*sizeof(float));
    float *mypart = (float*)malloc(n*sizeof(float));
    float sum = 0.0;
    float dx = 1.0/n;
    float x1[2] __attribute__((aligned(16)));
    one = _mm_set_ps1(1.0); // set one to (1,1)
    for (i = 0; i < n; i++){
        x[i] = dx/2 + dx*i;
    }
    for (i = 0; i < n; i+=2){
        x1[0]=x[i]; x1[1]=x[i+1];
        x2 = _mm_load_ps(x1);
        b = _mm_mul_ps(x2,x2);
        c = _mm_add_ps(b,one);
        mypart2 = _mm_div_ps(one,c);
        _mm_store_ps(&mypart[i], mypart2);
    }
    for (i = 0; i < n; i++)
        sum += mypart[i];
    return sum*dx;
}
int main(){
    float res;
    printf("pi = %lf\n", 4*res);
    return 0;
}
A few more fixes are needed:
x1 needs to be declared with 4 elements.
The second for loop needs to increment by 4 (this is what caused the segfault).
There need to be 4 assignments to the x1 array.
These changes are all because single-precision packs 4 values into a 16-byte vector register while double-precision packs only 2 values. I think that was it:
#include <stdio.h>
#include <stdlib.h>
#include <xmmintrin.h>
inline float pi_4 (int n){
    int i;
    __m128 mypart2,x2, b, c, one;
    float *x = (float *)malloc(n*sizeof(float));
    float *mypart = (float*)malloc(n*sizeof(float));
    float sum = 0.0;
    float dx = 1.0/n;
    float x1[4] __attribute__((aligned(16)));
    one = _mm_set_ps1(1.0); // set one to (1,1,1,1)
    for (i = 0; i < n; i++){
        x[i] = dx/2 + dx*i;
    }
    for (i = 0; i < n; i+=4){
        x1[0]=x[i]; x1[1]=x[i+1];
        x1[2]=x[i+2]; x1[3]=x[i+3];
        x2 = _mm_load_ps(x1);
        b = _mm_mul_ps(x2,x2);
        c = _mm_add_ps(b,one);
        mypart2 = _mm_div_ps(one,c);
        _mm_store_ps(&mypart[i], mypart2);
    }
    for (i = 0; i < n; i++)
        sum += mypart[i];
    return sum*dx;
}
int main(){
    float res;
    printf("pi = %lf\n", 4*res);
    return 0;
}
Drum roll...
$ ./foo
pi = 3.141597
A word on the use of malloc(). I think most implementations will return memory aligned on a 16-byte boundary as required for SSE loads and stores, but that may not be guaranteed as __m128 is not a C/C++ type (it is guaranteed to be aligned for "normal" types). It would be safer to use memalign() or posix_memalign().
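As a rough illustration of that last point, the output buffer can be allocated with posix_memalign so that the aligned store is safe by construction. This sketch assumes an x86 target with SSE and a POSIX system; pi_4_aligned is a hypothetical name, n is assumed to be a positive multiple of 4, and the x values are built directly in registers instead of going through a temporary array.

```cpp
#include <cassert>
#include <stdlib.h>    // posix_memalign, free (POSIX)
#include <xmmintrin.h> // SSE single-precision intrinsics

// Hypothetical variant of pi_4: midpoint-rule estimate of the integral of
// 1/(1+x^2) over [0,1], which equals pi/4.
float pi_4_aligned(int n) {
    assert(n > 0 && n % 4 == 0); // assumption: 4 floats per SSE register
    void* raw = nullptr;
    // Request 16-byte alignment explicitly instead of relying on malloc.
    if (posix_memalign(&raw, 16, static_cast<size_t>(n) * sizeof(float)) != 0)
        return 0.0f; // allocation failed
    float* part = static_cast<float*>(raw);

    const float dx = 1.0f / n;
    const __m128 one = _mm_set_ps1(1.0f);
    for (int i = 0; i < n; i += 4) {
        // _mm_set_ps lists lanes from highest to lowest, so the midpoint of
        // interval i goes last and lands in lane 0.
        const __m128 x = _mm_set_ps(dx / 2 + dx * (i + 3), dx / 2 + dx * (i + 2),
                                    dx / 2 + dx * (i + 1), dx / 2 + dx * i);
        const __m128 y = _mm_div_ps(one, _mm_add_ps(_mm_mul_ps(x, x), one));
        // Aligned store: part is 16-byte aligned and i is a multiple of 4.
        _mm_store_ps(&part[i], y);
    }

    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += part[i];
    free(raw);
    return sum * dx;
}
```

Multiplying the result by 4 gives the estimate of pi; if n is not a multiple of 4, a scalar loop for the remaining elements would be needed.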