Rcpp versus C - Outlier - c++

I have a general question with a specific example. I wrote a function to calculate columnwise variance of a matrix in C (using .Call interface) and C++ (using Rcpp interface). Looking at the following benchmarks I wonder:
> microbenchmark(times = 1000,
+ colVar(AB), # .Call Interface
+ colV(AB, ncol(AB), nrow(AB)), #Rcpp
+ apply(AB, 2, var)) #R
Unit: milliseconds
expr min lq mean median uq max neval
colVar(AB) 3.245000 3.350793 3.474891 3.433126 3.543796 5.110652 1000
colV(AB, ncol(AB), nrow(AB)) 4.064942 4.408336 10.215952 5.934169 6.383477 99.651530 1000
apply(AB, 2, var) 28.260730 30.740058 46.674155 31.464449 33.586160 129.343892 1000
>
In terms of distribution and mean, the C and the C++ functions perform fairly similarly; however, the maximum values differ hugely. Can anybody explain to me why? This is especially interesting since I am trying to learn C/C++, but also because I want to write more complicated functions in C/C++, where this could actually matter.
AB is a matrix with dimension 1000 x 1000, created with 1 000 000 rnorm() values.
Below you find the code for my C and Rcpp functions:
C (R-Level):
colVar <- function(x){
.Call("colV", x, ncol(x), nrow(x))
}
C (C-Level):
#include <R.h>
#include <Rinternals.h>
#include <math.h>
SEXP colV(SEXP y, SEXP n, SEXP r){
int *nc = INTEGER(n);
double *x = REAL(y);
int d = length(y);
int *nr = INTEGER(r);
int i, j, z;
//int d = nr * nc;
double xSq[(d)];
SEXP result;
PROTECT(result = allocVector(REALSXP, (*nc)));
memset(REAL(result), 0, (*nc) * sizeof(double));
double *colVar = REAL(result);
int fr = ((*nr) - 1);
for(z = 0; z < (d); z++){
xSq[z] = pow(x[z], 2);
}
for(i = 0; i < (*nc); i++){
double colMean = 0;
double xSm = 0;
double colMsq = 0;
for(j = 0; j < (*nr); j++){
colMean += ((x[(j + ((*nr) * i)) ]) / (*nr));
xSm += (xSq[(j + (*nr * i))]);
}
colMsq = (*nr) * (pow(colMean, 2));
colVar[i] = ((xSm - colMsq) / fr);
}
UNPROTECT(1);
return(result);
}
And the Rcpp-Function:
cppFunction(plugins = "unwindProtect",'NumericVector colV(NumericVector y, int n, int r){
int nc = n;
NumericVector x = y;
int nr = r;
int d = n * r;
int i, j, z;
// NumericVector colMean (nc);
NumericVector xSq (d);
// NumericVector colMsq (nc);
// NumericVector xSm (nc);
NumericVector colVar (nc);
int fr = ((nr) - 1);
for(z = 0; z < (d); z++){
xSq[z] = x[z] * x[z];
}
for(i = 0; i < (nc); i++){
double colMean = 0;
double xSm = 0;
double colMsq = 0;
for(j = 0; j < (nr); j++){
colMean += ((x[(j + ((nr) * i)) ]) / (nr));
xSm += (xSq[(j + (nr * i))]);
}
colMsq = (nr) * (colMean * colMean);
colVar[i] = ((xSm - colMsq) / fr);
}
return colVar;
}')
I have commented out stuff in the C++ function to make it as similar as possible to the C function.
If any of you can help me with my question, I would be very thankful.

Related

How do I use blitz++

I am a beginner in C++. My focus in learning C++ is scientific computation, and I want to use the blitz++ library. I am trying to implement the rk4 method, but I do not understand the inner workings of the code (I know the rk4 algorithm).
#include <blitz/array.h>
#include <iostream>
#include <stdlib.h>
#include <math.h>
using namespace blitz;
using namespace std;
// This will evaluate the slopes. say if dy/dx = y, rhs_eval will return y.
void rhs_eval(double x, Array<double, 1> y, Array<double, 1>& dydx)
{
dydx = y;
}
void rk4_fixed(double& x, Array<double, 1>& y, void (*rhs_eval)(double, Array<double, 1>, Array<double, 1>&), double h)
{
// Array y assumed to be of extent n, where n is no. of coupled equations
int n = y.extent(0);
// Declare local arrays
Array<double, 1> k1(n), k2(n), k3(n), k4(n), f(n), dydx(n);
// Zeroth intermediate step
(*rhs_eval) (x, y, dydx);
for (int j = 0; j < n; j++)
{
k1(j) = h * dydx(j);
f(j) = y(j) + k1(j) / 2.;
}
// First intermediate step
(*rhs_eval) (x + h / 2., f, dydx);
for (int j = 0; j < n; j++)
{
k2(j) = h * dydx(j);
f(j) = y(j) + k2(j) / 2.;
}
// Second intermediate step
(*rhs_eval) (x + h / 2., f, dydx);
for (int j = 0; j < n; j++)
{
k3(j) = h * dydx(j);
f(j) = y(j) + k3(j);
}
// Third intermediate step
(*rhs_eval) (x + h, f, dydx);
for (int j = 0; j < n; j++)
{
k4(j) = h * dydx(j);
}
// Actual step
for (int j = 0; j < n; j++)
{
y(j) += k1(j) / 6. + k2(j) / 3. + k3(j) / 3. + k4(j) / 6.;
}
x += h;
return; // goes back to function. evaluate y at x+h without returning anything
}
int main()
{
cout << y <<endl; // this will not work. The scope of y is limited to rk4_fixed
}
Here are my questions:
In rhs_eval, x and y are just values, but dydx is a pointer, so rhs_eval's output is stored through it and there is no need to return anything. Am I correct?
What does int n = y.extent(0) do? The comment says n is the number of coupled equations. What does extent do, and what is the meaning of the '0' in extent(0)? Is it the size of the first element?
How do I print the value of 'y'? What is the format? I want to get the value of y from rk4 by calling it from main, then print it.
I compiled blitz++ using MSVS 2019 with CMake, following these instructions.
I got the code from here; only the function is given.
Yes, and also change y to be passed by reference. A pointer is declared with * or a pointer template; a reference is declared with &.
Your vector has one dimension, or extent. In general Array<T,n> is a tensor of order n; for n=2 it is a matrix. .extent(0) is the size of the first dimension, using a zero-based index.
This is complicated and not well documented; I mean the facilities provided by the Blitz library. You can just print the components manually. For some reason my version produces a memory error if the first print command is commented out.
#include <blitz/array.h>
#include <iostream>
#include <cstdlib>
//#include <cmath>
using namespace blitz;
using namespace std;
/* This will evaluate the slopes. say if dy/dx = y, rhs_eval will return y. */
const double sig = 10; const double rho = 28; const double bet = 8.0/3;
void lorenz(double x, Array<double, 1> & y, Array<double, 1> & dydx)
{
/* y vector = x,y,z in components */
/*
dydx[0] = sig * (y[1] - y[0]);
dydx[1] = rho * y[0] - y[1] - y[0] * y[2];
dydx[2] = y[0] * y[1] - bet * y[2];
*/
/* use the comma operator */
dydx = sig * (y[1] - y[0]), rho * y[0] - y[1] - y[0] * y[2], y[0] * y[1] - bet * y[2];
}
void rk4_fixed(double& x, Array<double, 1> & y, void (*rhs_eval)(double, Array<double, 1>&, Array<double, 1>&), double h)
{
// Array y assumed to be of extent n, where n is no. of coupled equations
int n = y.extent(0);
// Declare local arrays
Array<double, 1> k1(n), k2(n), k3(n), k4(n), f(n), dydx(n);
// Zeroth intermediate step
rhs_eval (x, y, dydx);
k1 = h * dydx; f=y+0.5*k1;
// First intermediate step
rhs_eval(x + 0.5*h, f, dydx);
k2 = h * dydx; f = y+0.5*k2;
// Second intermediate step
rhs_eval (x + 0.5*h, f, dydx);
k3 = h * dydx; f=y+k3;
// Third intermediate step
rhs_eval (x + h, f, dydx);
k4 = h * dydx;
// Actual step
y += k1 / 6. + k2 / 3. + k3 / 3. + k4 / 6.;
x += h;
return; //# goes back to function. evaluate y at x+h without returning anything
}
int main()
{
Array<double, 1> y(3);
y = 1,1,1;
cout << y << endl;
double x=0, h = 0.05;
while(x<20) {
rk4_fixed(x,y,lorenz,h);
cout << x;
for(int k =0; k<3; k++) {
cout << ", "<< y(k);
}
cout << endl;
}
return 0;
}
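On the first point, the difference between passing by value, by reference and by pointer can be seen in a small standalone sketch. It uses std::vector rather than a Blitz array, so it only illustrates the standard C++ calling conventions; Blitz's Array has its own copy semantics, which its documentation describes. The function names are made up for this example.

```cpp
#include <vector>

// Receives a copy: the caller's vector is left untouched.
void by_value(std::vector<double> v) { v[0] = 99.0; }

// Receives a reference (&): writes go straight to the caller's vector.
void by_reference(std::vector<double>& v) { v[0] = 99.0; }

// Receives a pointer (*): the caller passes an address and the
// function dereferences it before writing.
void by_pointer(std::vector<double>* v) { (*v)[0] = 99.0; }
```

Calling by_value(y) leaves y[0] unchanged, while by_reference(y) and by_pointer(&y) both overwrite it. This is why an output argument such as dydx needs the & in the signature and no return value.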
#include <blitz/array.h>
#include <iostream>
#include <cstdlib>
using namespace blitz;
using namespace std;
/* This will evaluate the slopes. say if dy/dx = y, rhs_eval will return y. */
const double sig = 10; const double rho = 28; const double bet = 8.0 / 3;
void lorenz(double x, Array<double, 1> y, Array<double, 1> &dydx)
{
/* y vector = x,y,z in components */
dydx(0) = sig * (y(1) - y(0));
dydx(1) = rho * y(0) - y(1) - y(0) * y(2);
dydx(2) = y(0) * y(1) - bet * y(2);
}
void rk4_fixed(double& x, Array<double, 1>& y, void (*rhs_eval)(double, Array<double, 1>, Array<double, 1> &), double h)
{
int n = y.extent(0);
Array<double, 1> k1(n), k2(n), k3(n), k4(n), f(n), dydx(n);
(*rhs_eval) (x, y, dydx);
for (int j = 0; j < n; j++)
{
k1(j) = h * dydx(j);
f(j) = y(j) + k1(j) / 2.0;
}
(*rhs_eval) (x + h / 2., f, dydx);
for (int j = 0; j < n; j++)
{
k2(j) = h * dydx(j);
f(j) = y(j) + k2(j) / 2.;
}
(*rhs_eval) (x + h / 2., f, dydx);
for (int j = 0; j < n; j++)
{
k3(j) = h * dydx(j);
f(j) = y(j) + k3(j);
}
(*rhs_eval) (x + h, f, dydx);
for (int j = 0; j < n; j++)
{
k4(j) = h * dydx(j);
}
for (int j = 0; j < n; j++)
{
y(j) += k1(j) / 6. + k2(j) / 3. + k3(j) / 3. + k4(j) / 6.;
}
x += h;
}
int main()
{
Array<double, 1> y(3);
y = 1, 1, 1;
double x = 0, h = 0.05;
Array<double, 1> dydx(3);
dydx = 0, 0, 0;
for (int i = 0; i < 10; i++)
{
rk4_fixed(x, y, &lorenz, h);
cout << x << " ,";
for (int j = 0; j < 3; j++)
{
cout << y(j)<<" ";
}
cout << endl;
}
return 0;
}
Another great thing is that it is possible to use blitz++ without compiling it yourself. In Visual Studio 2019, expand {project name}, then right-click "References", choose "Manage NuGet Packages", search for blitz++ and download it. No extra linking or other setup is needed.

Matlab VS. C++ in matrix calculation

I am using C++ to do some matrix calculations using Armadillo library.
I tried to make it similar to the Matlab version.
But when I ran the code, Matlab took about 2 - 3 min while C++ took about 20 min.
I searched a bit and found that other people have also asked why C++ is slower than Matlab in matrix calculations.
But I have heard that C++ is way faster than Matlab, so I was wondering whether C++ is generally not as good as Matlab for matrix calculations.
Below is just part of my entire code.
Is there any way I can speed up C++ matrix calculations?
Should I use a different library?
while (dif >= tol && it <= itmax) {
it = it + 1;
V = Vnew;
Vfuture = beta * (Ptrans(0) * Vnew.slice(0) + Ptrans(1) * Vnew.slice(1) + Ptrans(2) * Vnew.slice(2));
for (int a = 0; a < Na; a++) {
for (int b = 0; b < Nd; b++) {
for (int c = 0; c < Ny; c++) {
Mat<double> YY(Na, Nd);
YY.fill(Y(c));
Mat<double> AA(Na, Nd);
AA.fill(A(a));
Mat<double> DD(Na, Nd);
DD.fill(D(b));
Mat<double> CC = YY + AA - mg_A_v / R - (mg_D_v - (1 - delta) * DD);
Mat<double> Val = 1 / (1 - 1 / sig) * pow(pow(CC, psi) % pow(mg_D_v, 1 - psi), (1 - 1 / sig)) + Vfuture;
double max_val = Val.max();
uword maxindex_val = Val.index_max();
int index_column = maxindex_val / Na; // column
int index_row = maxindex_val - index_column * Na; // row
Vnew(a, b, c) = max_val;
maxposition_a(a, b, c) = index_row;
maxposition_d(a, b, c) = index_column;
}
}
}
// Howard improvement
for (int h = 0; h < H; h++) {
Vhoward = Vnew;
for (int i = 0; i < Na; i++) {
for (int j = 0; j < Nd; j++) {
for (int k = 0; k < Ny; k++) {
temphoward(i, j) = beta * Vhoward(maxposition_a(i, j, k), maxposition_d(i, j, k), 0) * Ptrans(0) + beta * Vhoward(maxposition_a(i, j, k), maxposition_d(i, j, k), 1) * Ptrans(1) + beta * Vhoward(maxposition_a(i, j, k), maxposition_d(i, j, k), 2) * Ptrans(2);
Vnew(i, j, k) = temphoward(i, j) + utility(Y(k) + A(i) - A(maxposition_a(i, j, k)) / R - D(maxposition_d(i, j, k)) + (1 - delta) * D(j), D(maxposition_d(i, j, k)), sig, psi);
}
}
}
}
tempdiff = abs(V - Vnew);
dif = tempdiff.max();
cout << dif << endl;
cout << it << endl;
}
And this is the part from the matlab.
while dif >= tol && it <= itmax
tic;
it = it + 1;
V = Vnew;
vFuture = beta*reshape(V,Na*Nd,Ny)*P;
for i_a = 1:Na %Loop over state variable a
for i_d = 1:Nd %Loop over state variable d
for i_y = 1:Ny %Loop over state variable y
val = reshape(Utility(Y(i_y) + A(i_a) - mg_A_v/R - (mg_D_v - (1-delta)*D(i_d)),mg_D_v),Na*Nd,1) + vFuture;
[Vnew(i_a,i_d,i_y), indpol(i_a,i_d,i_y)] = max(val);
[indpol_ap(i_a,i_d,i_y),indpol_dp(i_a,i_d,i_y)] = ind2sub([Na,Nd],indpol(i_a,i_d,i_y));
end
end
end
% Howard improvement step
for h = 1:H
Vhoward = Vnew;
for i_a = 1:Na %Loop over state variable a
for i_d = 1:Nd %Loop over state variable d
for i_y = 1:Ny %Loop over state variable y
Vnew(i_a,i_d,i_y) = Utility(Y(i_y) + A(i_a) - A(indpol_ap(i_a,i_d,i_y))/R - ...
(D(indpol_dp(i_a,i_d,i_y)) - (1-delta)*D(i_d)),D(indpol_dp(i_a,i_d,i_y))) ...
+ beta*reshape(Vhoward(indpol_ap(i_a,i_d,i_y),indpol_dp(i_a,i_d,i_y),:),1,Ny)*P;
end
end
end
end
dif = max(max(max(abs(V-Vnew))));
disp([it dif toc])
end

Implementing modular Runge-kutta 4th order method for a n-dimension system

I'm trying to make my 4th-order Runge-Kutta code modular. I don't want to have to write and declare the code every time I use it; instead I want to declare it in a .hpp and a .cpp file and use it separately. But I'm having some problems. Generally I want to solve an n-dimensional system of equations. For that I use two functions, one for the system of equations and another for the Runge-Kutta method, as follows:
double F(double t, double x[], int eq)
{
// System equations
if (eq == 0) { return (x[1]); }
else if (eq == 1) { return (gama * sin(OMEGA*t) - zeta * x[1] - alpha * x[0] - beta * pow(x[0], 3) - chi * x[2]); }
else if (eq == 2) { return (-kappa * x[1] - phi * x[2]); }
else { return 0; }
}
void rk4(double &t, double x[], double step)
{
double x_temp1[sistvar], x_temp2[sistvar], x_temp3[sistvar];
double k1[sistvar], k2[sistvar], k3[sistvar], k4[sistvar];
int j;
for (j = 0; j < sistvar; j++)
{
x_temp1[j] = x[j] + 0.5*(k1[j] = step * F(t, x, j));
}
for (j = 0; j < sistvar; j++)
{
x_temp2[j] = x[j] + 0.5*(k2[j] = step * F(t + 0.5 * step, x_temp1, j));
}
for (j = 0; j < sistvar; j++)
{
x_temp3[j] = x[j] + (k3[j] = step * F(t + 0.5 * step, x_temp2, j));
}
for (j = 0; j < sistvar; j++)
{
k4[j] = step * F(t + step, x_temp3, j);
}
for (j = 0; j < sistvar; j++)
{
x[j] += (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j]) / 6.0;
}
t += step;
}
The above code works and has been validated. However, it has some dependencies, as it relies on global variables:
gama, OMEGA, zeta, alpha, beta, chi, kappa and phi are global variables that I want to read from a .txt file. I have already managed to do that, but only in a single .cpp file with all the code included.
Also, sistvar is the system dimension and is a global variable too. I'm trying to pass it as an argument to F, but the way the code is written gives errors: sistvar is a const and cannot be changed like a variable, and I cannot use a variable as an array size.
In addition, the two functions are interdependent, because when I call F inside rk4 the equation number eq is needed.
Could you give me tips on how to do that? I have already searched and read books about this and could not find an answer. It is probably an easy task, but I'm relatively new to the C/C++ programming languages.
Thanks in advance!
* EDITED (Tried to implement using std::vector)*
double F(double t, std::vector<double> x, int eq)
{
// System Equations
if (eq == 0) { return (x[1]); }
else if (eq == 1) { return (gama * sin(OMEGA*t) - zeta * x[1] - alpha * x[0] - beta * pow(x[0], 3) - chi * x[2]); }
else if (eq == 2) { return (-kappa * x[1] - phi * x[2]); }
else { return 0; }
}
double rk4(double &t, std::vector<double> &x, double step, const int dim)
{
std::vector<double> x_temp1(dim), x_temp2(dim), x_temp3(dim);
std::vector<double> k1(dim), k2(dim), k3(dim), k4(dim);
int j;
for (j = 0; j < dim; j++) {
x_temp1[j] = x[j] + 0.5*(k1[j] = step * F(t, x, j));
}
for (j = 0; j < dim; j++) {
x_temp2[j] = x[j] + 0.5*(k2[j] = step * F(t + 0.5 * step, x_temp1, j));
}
for (j = 0; j < dim; j++) {
x_temp3[j] = x[j] + (k3[j] = step * F(t + 0.5 * step, x_temp2, j));
}
for (j = 0; j < dim; j++) {
k4[j] = step * F(t + step, x_temp3, j);
}
for (j = 0; j < dim; j++) {
x[j] += (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j]) / 6.0;
}
t += step;
for (j = 0; j < dim; j++) {
return x[j];
}
}
vector | array
2.434 s | 0.859 s
2.443 s | 0.845 s
2.314 s | 0.883 s
2.418 s | 0.884 s
2.505 s | 0.852 s
2.428 s | 0.923 s
2.097 s | 0.814 s
2.266 s | 0.922 s
2.133 s | 0.954 s
2.266 s | 0.868 s
average = 2.330 s | average = 0.880 s
Using vector functions, with the vector arithmetic taken from Eigen3,
#include <eigen3/Eigen/Dense>
using namespace Eigen;
the same parts as discussed in the question could look like this (inspired by function pointer with Eigen):
VectorXd Func(const double t, const VectorXd& x)
{ // equations for solving simple harmonic oscillator
Vector3d dxdt;
dxdt[0] = x[1];
dxdt[1] = gama * sin(OMEGA*t) - zeta * x[1] - alpha * x[0] - beta * pow(x[0], 3) - chi * x[2];
dxdt[2] = -kappa * x[1] - phi * x[2];
return dxdt;
}
MatrixXd RK4(VectorXd Func(double t, const VectorXd& y), const Ref<const VectorXd>& y0, double t, double h, int step_num)
{
MatrixXd y(y0.rows(), step_num );
VectorXd k1, k2, k3, k4;
y.col(0) = y0;
for (int i=1; i<step_num; i++){
k1 = Func(t, y.col(i-1));
k2 = Func(t+0.5*h, y.col(i-1)+0.5*h*k1);
k3 = Func(t+0.5*h, y.col(i-1)+0.5*h*k2);
k4 = Func(t+h, y.col(i-1)+h*k3);
y.col(i) = y.col(i-1) + (k1 + 2*k2 + 2*k3 + k4)*h/6;
t = t+h;
}
return y.transpose();
}
Passing a vector to a function to be filled apparently requires some higher template contemplations in Eigen.
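For comparison, the same modular layout can be sketched with only the standard library. This is one possible arrangement under stated assumptions, not the approach of the answer above: the names State, Rhs, OscParams, make_oscillator and rk4_step are all invented for illustration. The idea is that the right-hand side fills a whole derivative vector in one call (so no eq index is needed), the coefficients travel in a struct instead of globals, and the dimension comes from x.size() instead of sistvar.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using State = std::vector<double>;
// Right-hand side: fills dxdt from (t, x). Hypothetical signature.
using Rhs = std::function<void(double, const State&, State&)>;

// Coefficients that were globals in the question (read them from a file
// elsewhere and pass the struct in).
struct OscParams {
    double gama, OMEGA, zeta, alpha, beta, chi, kappa, phi;
};

// Builds the three-equation system from the question for a given parameter set.
Rhs make_oscillator(const OscParams& p) {
    return [p](double t, const State& x, State& dxdt) {
        dxdt[0] = x[1];
        dxdt[1] = p.gama * std::sin(p.OMEGA * t) - p.zeta * x[1]
                  - p.alpha * x[0] - p.beta * x[0] * x[0] * x[0] - p.chi * x[2];
        dxdt[2] = -p.kappa * x[1] - p.phi * x[2];
    };
}

// One classical RK4 step; advances t and x in place.
void rk4_step(const Rhs& f, double& t, State& x, double h) {
    const std::size_t n = x.size();  // dimension taken from the state itself
    State k1(n), k2(n), k3(n), k4(n), tmp(n);

    f(t, x, k1);
    for (std::size_t i = 0; i < n; ++i) tmp[i] = x[i] + 0.5 * h * k1[i];
    f(t + 0.5 * h, tmp, k2);
    for (std::size_t i = 0; i < n; ++i) tmp[i] = x[i] + 0.5 * h * k2[i];
    f(t + 0.5 * h, tmp, k3);
    for (std::size_t i = 0; i < n; ++i) tmp[i] = x[i] + h * k3[i];
    f(t + h, tmp, k4);

    for (std::size_t i = 0; i < n; ++i)
        x[i] += h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0;
    t += h;
}
```

In this layout the declarations would go in the .hpp file and the definitions in the .cpp file. A caller builds the system once with make_oscillator(params) and then calls rk4_step in a loop; rk4_step itself knows nothing about the particular equations.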

Anderson Darling Test in C++

I am trying to compute the Anderson-Darling test found here. I followed the steps on Wikipedia and used MATLAB to verify the average and standard deviation I calculate for the data under test, denoted X. I also use a function called phi to compute the standard normal CDF, and I have tested it to make sure it is correct. The problem appears when I actually compute A-squared (as Wikipedia denotes it; I call it A in C++).
Here is my function I made for Anderson-Darling Test:
void Anderson_Darling(int n, double X[]){
sort(X,X + n);
// Find the mean of X
double X_avg = 0.0;
double sum = 0.0;
for(int i = 0; i < n; i++){
sum += X[i];
}
X_avg = ((double)sum)/n;
// Find the variance of X
double X_sig = 0.0;
for(int i = 0; i < n; i++){
X_sig += (X[i] - X_avg)*(X[i] - X_avg);
}
X_sig /= n;
// The values X_i are standardized to create new values Y_i
double Y[n];
for(int i = 0; i < n; i++){
Y[i] = (X[i] - X_avg)/(sqrt(X_sig));
//cout << Y[i] << endl;
}
// With a standard normal CDF, we calculate the Anderson_Darling Statistic
double A = 0.0;
for(int i = 0; i < n; i++){
A += -n - 1/n *(2*(i) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n+1 - i])));
}
cout << A << endl;
}
Note, I know that the formula for Anderson-Darling (A-squared) starts with i = 1 to i = n, although when I changed the index to make it work in C++, I still get the same result without changing the index.
The value I get in C++ is:
-4e+006
The value I should get, received in MATLAB is:
0.2330
Any suggestions are greatly appreciated.
Here is my whole code:
#include <iostream>
#include <math.h>
#include <cmath>
#include <random>
#include <algorithm>
#include <chrono>
using namespace std;
double *Box_Muller(int n, double u[]);
double *Beasley_Springer_Moro(int n, double u[]);
void Anderson_Darling(int n, double X[]);
double phi(double x);
int main(){
int n = 2000;
double Mersenne[n];
random_device rd;
mt19937 e2(1);
uniform_real_distribution<double> dist(0, 1);
for(int i = 0; i < n; i++){
Mersenne[i] = dist(e2);
}
// Print Anderson Statistic for Mersenne 6a
double *result = new double[n];
result = Box_Muller(n,Mersenne);
Anderson_Darling(n,result);
return 0;
}
double *Box_Muller(int n, double u[]){
double *X = new double[n];
double Y[n];
double R_2[n];
double theta[n];
for(int i = 0; i < n; i++){
R_2[i] = -2.0*log(u[i]);
theta[i] = 2.0*M_PI*u[i+1];
}
for(int i = 0; i < n; i++){
X[i] = sqrt(-2.0*log(u[i]))*cos(2.0*M_PI*u[i+1]);
Y[i] = sqrt(-2.0*log(u[i]))*sin(2.0*M_PI*u[i+1]);
}
return X;
}
double *Beasley_Springer_Moro(int n, double u[]){
double y[n];
double r[n+1];
double *x = new double(n);
// Constants needed for algo
double a_0 = 2.50662823884; double b_0 = -8.47351093090;
double a_1 = -18.61500062529; double b_1 = 23.08336743743;
double a_2 = 41.39119773534; double b_2 = -21.06224101826;
double a_3 = -25.44106049637; double b_3 = 3.13082909833;
double c_0 = 0.3374754822726147; double c_5 = 0.0003951896511919;
double c_1 = 0.9761690190917186; double c_6 = 0.0000321767881768;
double c_2 = 0.1607979714918209; double c_7 = 0.0000002888167364;
double c_3 = 0.0276438810333863; double c_8 = 0.0000003960315187;
double c_4 = 0.0038405729373609;
// Set r and x to empty for now
for(int i = 0; i <= n; i++){
r[i] = 0.0;
x[i] = 0.0;
}
for(int i = 1; i <= n; i++){
y[i] = u[i] - 0.5;
if(fabs(y[i]) < 0.42){
r[i] = pow(y[i],2.0);
x[i] = y[i]*(((a_3*r[i] + a_2)*r[i] + a_1)*r[i] + a_0)/((((b_3*r[i] + b_2)*r[i] + b_1)*r[i] + b_0)*r[i] + 1);
}else{
r[i] = u[i];
if(y[i] > 0.0){
r[i] = 1.0 - u[i];
r[i] = log(-log(r[i]));
x[i] = c_0 + r[i]*(c_1 + r[i]*(c_2 + r[i]*(c_3 + r[i]*(c_4 + r[i]*(c_5 + r[i]*(c_6 + r[i]*(c_7 + r[i]*c_8)))))));
}
if(y[i] < 0){
x[i] = -x[i];
}
}
}
return x;
}
double phi(double x){
return 0.5 * erfc(-x * M_SQRT1_2);
}
void Anderson_Darling(int n, double X[]){
sort(X,X + n);
// Find the mean of X
double X_avg = 0.0;
double sum = 0.0;
for(int i = 0; i < n; i++){
sum += X[i];
}
X_avg = ((double)sum)/n;
// Find the variance of X
double X_sig = 0.0;
for(int i = 0; i < n; i++){
X_sig += (X[i] - X_avg)*(X[i] - X_avg);
}
X_sig /= (n-1);
// The values X_i are standardized to create new values Y_i
double Y[n];
for(int i = 0; i < n; i++){
Y[i] = (X[i] - X_avg)/(sqrt(X_sig));
//cout << Y[i] << endl;
}
// With a standard normal CDF, we calculate the Anderson_Darling Statistic
double A = -n;
for(int i = 0; i < n; i++){
A += -1.0/(double)n *(2*(i+1) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n - i])));
}
cout << A << endl;
}
Let me guess, your n was 2000. Right?
The major issue here is the 1/n in the last expression. 1 is an int and so is n, so dividing 1 by n performs integer division. 1 divided by any integer greater than 1 is 0 under integer division (think of it as keeping only the integer part of the quotient). What you need to do is cast n to double by writing 1/(double)n.
Everything else should work fine.
Summary from discussions -
Indexes to Y[] should be i and n-1-i respectively.
n should not be added in the loop but only once.
Minor fixes like changing divisor to n instead of n-1 while calculating Variance.
You have integer division here:
A += -n - 1/n *(2*(i) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n+1 - i])));
          ^^^
1/n is zero when n > 1 - you need to change this to, e.g.: 1.0/n:
A += -n - 1.0/n *(2*(i) - 1)*(log(phi(Y[i])) + log(1 - phi(Y[n+1 - i])));
          ^^^^^
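Putting the points from both answers together (floating-point division, -n applied once outside the loop, and the index pair i and n-1-i), a minimal sketch of the statistic could look as follows. The function names are made up for this example, it takes a std::vector instead of a raw array, and it assumes the sample variance with divisor n-1; which divisor matches the MATLAB result has to be checked against the reference implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal CDF, same formula as phi() in the question.
double std_normal_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Anderson-Darling A^2 for normality with estimated mean and variance.
// Takes the sample by value so the caller's data is not reordered.
double anderson_darling(std::vector<double> x) {
    const std::size_t n = x.size();
    std::sort(x.begin(), x.end());

    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(n);

    double var = 0.0;
    for (double v : x) var += (v - mean) * (v - mean);
    var /= static_cast<double>(n - 1);  // assumption: divisor n - 1
    const double sd = std::sqrt(var);

    // Zero-based i here corresponds to the formula's index i + 1, so the
    // weight is 2(i+1) - 1 and the paired element is x[n-1-i].
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double lo = std_normal_cdf((x[i] - mean) / sd);
        const double hi = std_normal_cdf((x[n - 1 - i] - mean) / sd);
        sum += (2.0 * static_cast<double>(i + 1) - 1.0)
               * (std::log(lo) + std::log(1.0 - hi));
    }
    // -n is applied once, and the division is done in double.
    return -static_cast<double>(n) - sum / static_cast<double>(n);
}
```

Because every operand in the final expression is a double, the 1/n truncation cannot occur, and the largest index ever read is n-1.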

C++: What is wrong with the output?

This code should output 0 0.25 0.5 0.75 1, instead it outputs zeros. Why is that?
Define a function u(x)=x;
void pde_advect_IC(double* x, double* u)
{
int N = sizeof(x) / sizeof(x[0]); //size of vector u
for (int i = 0; i <= N; i++)
u[i] = x[i];
}
Here is the implementation:
int main()
{
double a = 0.0;
double b = 1.0;
int nx = 4;
double dx = (b - a) / double(nx);
double xx[nx + 1]; //array xx with intervals
// allocate memory for vectors of solutions u0
double* u0 = new double [nx + 1];
//fill in array x
for (int i = 0; i <= nx; i++)
xx[i] = a + double(i) * dx;
pde_advect_IC(xx, u0); // u0 = x (initial conditions)
for (int i = 0; i <= nx; i++)
cout<<u0[i]<<endl;
// de-allocate memory of u0
delete [] u0;
delete [] u1;
return 0;
}
You can't use sizeof(x) here because, inside the function, x is a pointer: sizeof(x) returns the size of the pointer, not of the array you thought you passed. You have to pass the size as a third parameter, or use something more convenient such as a std::vector and call size().
This works.
#include <iostream>
#include <cstdlib>
using namespace std;
void pde_advect_IC(double* x, double* u, const int& N)
{
for (int i = 0; i < N; i++)
u[i] = x[i];
}
int main()
{
double a = 0.0;
double b = 1.0;
int nx = 4;
double dx = (b - a) / double(nx);
double xx[nx + 1]; //array xx with intervals
// allocate memory for vectors of solutions u0
double* u0 = new double [nx + 1];
//fill in array x
for (int i = 0; i <= nx; i++)
xx[i] = a + double(i) * dx;
pde_advect_IC(xx, u0, nx + 1); // u0 = x (initial conditions)
for (int i = 0; i <= nx; i++)
cout << u0[i] << endl;
// de-allocate memory of u0
delete [] u0;
return 0;
}
Note that I added const int& N to pde_advect_IC() in order to pass it the size of the array, by const reference, to be sure it does not get modified by mistake.
Note that your trick with sizeof() does not work with pointers.
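The std::vector alternative mentioned in the first answer can be sketched as follows. The function make_grid is a hypothetical helper added for this example, and pde_advect_IC is changed to return its result instead of filling a raw pointer; the point is only that a vector carries its own length, so no sizeof trick or extra size parameter is needed.

```cpp
#include <cstddef>
#include <vector>

// Builds nx + 1 equally spaced points on [a, b]. Hypothetical helper.
std::vector<double> make_grid(double a, double b, int nx) {
    const double dx = (b - a) / static_cast<double>(nx);
    std::vector<double> xx(static_cast<std::size_t>(nx) + 1);
    for (std::size_t i = 0; i < xx.size(); ++i)
        xx[i] = a + static_cast<double>(i) * dx;
    return xx;
}

// Initial condition u(x) = x; the length comes from x.size().
std::vector<double> pde_advect_IC(const std::vector<double>& x) {
    std::vector<double> u(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        u[i] = x[i];
    return u;
}
```

With a = 0, b = 1 and nx = 4, pde_advect_IC(make_grid(0.0, 1.0, 4)) holds the five values 0, 0.25, 0.5, 0.75 and 1, and there is nothing to delete afterwards because the vectors release their own memory.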