C++ Advice on manipulating output Matrix data


I have the following code.
Essentially it is creating N random normal variables, and running through an equation M times for a simulation.
The output should be an NxM matrix of data; however, the only way I could do the calculation produces the output as MxN, i.e. each of the M runs should be a column, not a row.
I have attempted in vain to follow some of the other suggestions that have been posted on previous similar topics.
Code:
#include <iostream>
#include <time.h>
#include <random>
int main()
{
double T = 1; // End time period for simulation
int N = 4; // Number of time steps
int M = 2; // Number of simulations
double x0 = 1.00; // Starting x value
double mu = 0.00; // mu(x,t) value
double sig = 1.00; // sigma(x,t) value
double dt = T/N;
double sqrt_dt = sqrt(dt);
double** SDE_X = new double*[M]; // SDE Matrix setup
// Random Number generation setup
double RAND_N;
srand ((unsigned int) time(NULL)); // Generator loop reset
std::default_random_engine generator (rand());
std::normal_distribution<double> distribution (0.0,1.0); // Mean = 0.0, Variance = 1.0 ie Normal
for (int i = 0; i < M; i++)
{
SDE_X[i] = new double[N];
for (int j=0; j < N; j++)
{
RAND_N = distribution(generator);
SDE_X[i][0] = x0;
SDE_X[i][j+1] = SDE_X[i][j] + mu * dt + sig * RAND_N * sqrt_dt; // The SDE we wish to plot the path for
std::cout << SDE_X[i][j] << " ";
}
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << " The simulation is complete!!" << std::endl;
std::cout << std::endl;
system("pause");
return 0;
}

Well why can't you just create the transpose of your SDE_X matrix then? Isn't that what you want to get?

Keep in mind that presentation has nothing to do with implementation. Whether you iterate over columns or rows is your decision. If you want the data stored transposed, the quick and dirty fix is to create the whole matrix first and then fill in the number series, swapping i with j and N with M.
I say quick and dirty because the program as a whole has problems:
Why not keep it simple and use a better data structure for the matrix? If the size is known at compile time, use a fixed-size array; otherwise use vectors at runtime. There may be nicer 2D array implementations as well.
I think there is a bug: you allocate N doubles per row but access indices 0 to N inclusive.
You also set index 0 to x0 in every iteration, which is needless.
I would restructure your code a bit to make it clearer:
create your matrix at first
initialize the first value of the matrix
provide an algorithm function calculating a target cell taking the matrix and the parameters.
Go through each cell and invoke your function for that cell
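The steps listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: simulate_paths is a hypothetical helper that does not appear in the original post, the matrix is stored as a vector of vectors with N+1 rows (time steps, including the start value) and M columns (simulations), and std::mt19937 stands in for the generator used in the question. The random draws happen in a different order than in the original loop, which should not matter statistically.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the original post): returns an (N+1) x M
// matrix in which row j is time step j and column i is simulation i.
std::vector<std::vector<double>> simulate_paths(std::size_t N, std::size_t M,
                                                double T, double x0,
                                                double mu, double sig,
                                                std::mt19937& gen)
{
    assert(N > 0 && M > 0);
    const double dt = T / static_cast<double>(N);
    const double sqrt_dt = std::sqrt(dt);
    std::normal_distribution<double> dist(0.0, 1.0);

    // Create the whole matrix first; every cell starts at x0, so row 0
    // already holds the starting value of each simulation.
    std::vector<std::vector<double>> X(N + 1, std::vector<double>(M, x0));

    // Fill each cell from the cell directly above it (the previous time step).
    for (std::size_t j = 1; j <= N; ++j) {
        for (std::size_t i = 0; i < M; ++i) {
            X[j][i] = X[j - 1][i] + mu * dt + sig * dist(gen) * sqrt_dt;
        }
    }
    return X;
}
```

Printing the result row by row then gives one line per time step and one column per simulation, with no separate transposition pass.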

Thank you all for your input. I was able to implement my code and have it displayed as needed.
I added a second for loop to rearrange the matrix rows and columns.
Please feel free to let me know if you think there is any way I can improve it.
#include <iostream>
#include <time.h>
#include <random>
#include <vector>
int main()
{
double T = 1; // End time period for simulation
int N = 3; // Number of time steps
int M = 2; // Number of simulations
int X = 100; // Max number of matrix columns
int Y = 100; // Max number of matrix rows
double x0 = 1.00; // Starting x value
double mu = 0.00; // mu(x,t) value
double sig = 1.00; // sigma(x,t) value
double dt = T/N;
double sqrt_dt = sqrt(dt);
std::vector<std::vector<double>> SDE_X((M*N), std::vector<double>((M*N))); // SDE Matrix setup
// Random Number generation setup
double RAND_N;
srand ((unsigned int) time(NULL)); // Generator loop reset
std::default_random_engine generator (rand());
std::normal_distribution<double> distribution (0.0,1.0); // Mean = 0.0, Variance = 1.0 ie Normal
for (int i = 0; i <= M; i++)
{
SDE_X[i][0] = x0;
for (int j=0; j <= N; j++)
{
RAND_N = distribution(generator);
SDE_X[i][j+1] = SDE_X[i][j] + mu * dt + sig * RAND_N * sqrt_dt; // The SDE we wish to plot the path for
}
}
for (int j = 0; j <= N; j++)
{
for (int i = 0; i <=M; i++)
{
std::cout << SDE_X[i][j] << ", ";
}
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << " The simulation is complete!!" << std::endl;
std::cout << std::endl;
system("pause");
return 0;
}

Related

Optimize c++ Monte Carlo simulation with long dynamic arrays

This is my first post here and I am not that experienced, so please excuse my ignorance.
I am building a Monte Carlo simulation in C++ for my PhD and I need help optimizing its run time. The simulation volume is a 3D cube repeated along each coordinate, and inside every cube magnetic particles are generated in clusters. Then, in the central cube, protons are created one at a time in a loop; each proton moves and, at every step, I calculate (among other things) the total magnetic field it feels from all the particles.
At the moment I define everything inside the main function, and because I need the particle positions for my calculations (I compute the distance between particles during their placement and also during the proton movement), I store them in dynamic arrays. I haven't used any classes or functions yet. My simulations are really slow, because eventually I have to use millions of particles and thousands of protons; even with hundreds it takes days. I also use a lot of for and while loops and read from and write to .dat files.
I really need your help. I have spent weeks trying to optimize my code and my project is behind schedule. Do you have any suggestions? I need the arrays to store the positions of the particles. Do you think classes or functions would be more efficient? Any general advice is helpful. Sorry if this was too long, but I am desperate...
Ok, I edited my original post and I share my full script. I hope this will give you some insight regarding my simulation. Thank you.
Additionally I add the two input files
parametersDiffusion_spher_shel.txt
parametersIONP_spher_shel.txt
#include <stdio.h>
#include <iostream>
#include <fstream>
#include <stdlib.h>
#include <math.h>
#include <iomanip> //precision to output
//#include <time.h>
#include <ctime>
#include <cstdlib>
#include <algorithm>
#include <string>
//#include <complex>
#include <chrono> //random generator
#include <random>
using namespace std;
#define PI 3.14159265
#define tN 500000 //# of timepoints (steps) to define the arrays ONLY
#define D_const 3.0E-9 //diffusion constant (m^2/s)
#define Beq 0.16 // Tesla
#define gI 2.6752218744E8 //(sT)^-1
int main(){
//Mersenne Twister random engine
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
//uniform_int_distribution<int> intDist(0,1);
uniform_real_distribution<double> realDist(0.,1.);
//for(int i=1; i<100; i++){
//cout<<"R max: "<<Ragg-Rspm<<" r_spm: "<<(Ragg-Rspm)*sqrt(realDist(rng))<<endl;
//}
/////////////////////////////////////////////////////////////////////////////////////////////////////////
//input files
double Rionp=1.0E-8, Ragg=2.0E-7, t_tot=2.0E-2, l_tot = 3.0E-4;
int ionpN=10, aggN=10,cubAxN=10, parN=1E5;
int temp_ionpN, temp_aggN, temp_cubAxN, temp_parN;
ifstream inIONP;
inIONP.open("parametersIONP_spher_shel.txt");
if(!inIONP){
cout << "Unable to open IONP parameters file";
exit(1); // terminate with error
}
while (inIONP>>Rionp>>Ragg>>temp_ionpN>>temp_aggN>>l_tot>>temp_cubAxN) {
ionpN = (int)temp_ionpN;
aggN = (int)temp_aggN;
cubAxN = (int)temp_cubAxN;
}
inIONP.close();
cout<<"Rionp: "<<Rionp<<" ionpN: "<<ionpN <<" aggN: "<<aggN<<endl;
cout<<"l_tot: "<<l_tot<<" cubAxN: "<<cubAxN<<endl;
ifstream indiff;
indiff.open("parametersDiffusion_spher_shel.txt");
if(!indiff){
cout << "Unable to open diffusion parameters file";
exit(1); // terminate with error
}
while (indiff>>temp_parN>>t_tot) {
parN = (int)temp_parN;
}
indiff.close();
cout<<"parN: "<<parN<<" t_tot: "<<t_tot<<endl;
/////////////////////////////////////////////////////////////////////////////////////////////////
int cubN = pow(cubAxN,3.); // total cubes
int Nionp_tot = ionpN*aggN*cubN; //total IONP
double f_tot = (double)Nionp_tot*(4.*PI*pow(Rionp,3.)/3.)/pow(l_tot,3.);//volume density
//central cube
double l_c = l_tot/(double)cubAxN;
int Nionp_c = ionpN*aggN; //SPM in central cube
double f_c = (double)Nionp_c*(4.*PI*pow(Rionp,3.)/3.)/pow(l_c,3.);
cout<<"f_tot: "<<f_tot<<" Nionp_tot: "<<Nionp_tot<<" l_tot "<<l_tot<<endl;
cout<<"f_c: "<<f_c<<" Nionp_c: "<<Nionp_c<<" l_c "<<l_c<<endl;
cout<<"Now IONP are generated..."<<endl;
//position of aggregate (spherical distribution IONP)
double *x1_ionp, *x2_ionp, *x3_ionp, *theta_ionp, *phi_ionp, *r_ionp, *x1_agg, *x2_agg, *x3_agg;
x1_ionp = new double [Nionp_tot];
x2_ionp = new double [Nionp_tot];
x3_ionp = new double [Nionp_tot];
theta_ionp = new double [Nionp_tot];
phi_ionp = new double [Nionp_tot];
r_ionp = new double [Nionp_tot];
x1_agg = new double [Nionp_tot];
x2_agg = new double [Nionp_tot];
x3_agg = new double [Nionp_tot];
int ionpCounter = 0;
int aggCounter = 0;
double x1_aggTemp=0., x2_aggTemp=0., x3_aggTemp=0.;
double ionpDist = 0.; //distance SPM-SPM
for(int a=0; a<cubAxN; a++){ //x1-filling cubes
for(int b=0; b<cubAxN; b++){ //x2-
for(int c=0; c<cubAxN; c++){ //x3-
bool far_ionp = true;
cout<<"cube: (a, b, c): ("<<a<<", "<<b<<", "<<c<<")"<<endl;
for(int i=0; i<aggN; i++){ //aggregate iterations
x1_aggTemp=realDist(rng)*l_c + l_c*a - l_tot/2.; //from neg to pos filling
x2_aggTemp=realDist(rng)*l_c + l_c*b - l_tot/2.;
x3_aggTemp=realDist(rng)*l_c + l_c*c - l_tot/2.;
for(int j=0; j<ionpN; j++){ //SPM iterations
// cout<<"SPM: "<<j<<" aggregate: "<<i<<" cube: (a, b, c): ("<<a<<", "<<b<<", "<<c<<")"<<endl;
x1_agg[ionpCounter]=x1_aggTemp;
x2_agg[ionpCounter]=x2_aggTemp;
x3_agg[ionpCounter]=x3_aggTemp;
//uniform 4pi distribution in sphere
while(true){
far_ionp = true; //must be updated!
theta_ionp[ionpCounter] = 2.*PI*realDist(rng);
phi_ionp[ionpCounter] = acos(1. - 2.*realDist(rng));
r_ionp[ionpCounter] = (Ragg-Rionp)*sqrt(realDist(rng)); // to have uniform distribution sqrt
x1_ionp[ionpCounter] = sin(phi_ionp[ionpCounter])*cos(theta_ionp[ionpCounter])*r_ionp[ionpCounter] + x1_agg[ionpCounter];
x2_ionp[ionpCounter] = sin(phi_ionp[ionpCounter])*sin(theta_ionp[ionpCounter])*r_ionp[ionpCounter] + x2_agg[ionpCounter];
x3_ionp[ionpCounter] = cos(phi_ionp[ionpCounter])*r_ionp[ionpCounter] + x3_agg[ionpCounter];
for(int m=0; m<ionpCounter; m++){ //impenetrable IONP to each other
ionpDist = sqrt(pow(x1_ionp[m]-x1_ionp[ionpCounter],2.)+pow(x2_ionp[m]-x2_ionp[ionpCounter],2.)+pow(x3_ionp[m]-x3_ionp[ionpCounter],2.));
//cout<<"spmDist: "<<spmDist<<endl;
if((j>0) && (ionpDist <= 2*Rionp)){
far_ionp = false;
cout<<"CLOSE ionp-ionp! Distanse ionp-ionp: "<<ionpDist<<endl;
}
}
if(far_ionp){
cout<<"IONP can break now! ionpCounter: "<<ionpCounter<<endl;
break;
}
}
cout<<"r_ionp: "<<r_ionp[ionpCounter]<<" x1_ionp: "<<x1_ionp[ionpCounter]<<" x2_ionp: "<<x2_ionp[ionpCounter]<<" x3_ionp: "<<x3_ionp[ionpCounter]<<endl;
cout<<"x1_agg: "<<x1_agg[ionpCounter]<<" x2_agg: "<<x2_agg[ionpCounter]<<" x3_agg: "<<x3_agg[ionpCounter]<<endl;
ionpCounter++;
}
aggCounter++;
}
}
}
}
cout<<"ionpCounter: "<<ionpCounter<<" aggCounter: "<<aggCounter<<endl;
//=====proton diffusion=============//
//outfile
//proton diffusion time-positionSPM_uniform
FILE *outP_tPos;
outP_tPos = fopen("V3_MAT_positionProtons_spherical.dat","wb+");
if(!outP_tPos){// file couldn't be opened
cerr << "Error: file could not be opened" << endl;
exit(1);
}
//proton diffusion time-positionSPM_uniform
FILE *outP_tB;
outP_tB = fopen("V3_MAT_positionB_spherical.dat","wb+");
if(!outP_tB){// file couldn't be opened
cerr << "Error: file could not be opened" << endl;
exit(1);
}
double *cosPhase_S, *sinPhase_S, *m_tot;
cosPhase_S = new double [tN];
sinPhase_S = new double [tN];
m_tot = new double [tN];
double tstep = 0.; // time of each step
int stepCounter = 0; // counter for the steps for each proton
int cnt_stpMin=0; //, cnt_stpMax=0; //counters for the step length conditions
for (int i=0; i<parN; i++){// repetition for all the protons in the sample
stepCounter = 0; //reset
cout<<"Now diffusion calculated for proton: "<<i<<endl;
double x0[3]={0.}, xt[3]={0.}, vt[3]={0.};
double tt=0.;
double stepL_min = Rionp/8.; //min step length
double stepL_max = Rionp; //max step length
double stepL = 0.;
double extraL = 0.; //extra length beyond central cube
bool hit_ionp = false;
double pIONPDist = 0.; // proton-IONP Distance (vector ||)
double pIONPCosTheta = 0.; //proton_IONP vector cosTheta with Z axis
double Bloc = 0.; //B 1 IONP
double Btot = 0.; //SUM B all IONP
double Dphase = 0.; //Delta phase for step 1p
double phase = 0.; //phase 1p
double theta_p=0., phi_p=0.;
//randomized initial position of the particle;
x0[0] = realDist(rng)*l_c - l_c/2.;
x0[1] = realDist(rng)*l_c - l_c/2.;
x0[2] = realDist(rng)*l_c - l_c/2.;
//for (int j=0; j<tN; j++){ //steps
bool diffTime = true; // flag protons are allowed to diffuse (tt<10ms)
while(diffTime){ // steps loop
//unit vector for 4p direction
theta_p = 2.*PI*realDist(rng);
phi_p = acos(1. - 2.*realDist(rng));
vt[0] = sin(phi_p)*cos(theta_p);
vt[1] = sin(phi_p)*sin(theta_p);
vt[2] = cos(phi_p);;
//determine length of step
for(int k=0; k<ionpCounter; k++){
if(abs(sqrt(pow(x1_ionp[k]-x0[0],2.)+pow(x2_ionp[k]-x0[1],2.)+pow(x3_ionp[k]-x0[2],2.))-Rionp) <= 8*Rionp){
//spm closer than 8R
stepL = Rionp/8;
cnt_stpMin ++;
break;
}
else if(abs(sqrt(pow(x1_ionp[k]-x0[0],2.)+pow(x2_ionp[k]-x0[1],2.)+pow(x3_ionp[k]-x0[2],2.))-Rionp) > 8*Rionp){
stepL = Rionp;
}
else{
cout<<"sth wrong with the proton-IONP distance!"<<endl;
}
}
//determine Dt step duration
tstep = pow(stepL,2.)/(6.*D_const);
tt += tstep;
if(tt>t_tot){
diffTime = false; //proton is not allowed to diffuse any longer
cout<<"Proton id: "<<i<<" has reached diffusion time! -> Move to next one!"<<endl;
cout<<"stepCounter: "<<stepCounter<<" cnt_stpMin: "<<cnt_stpMin<<endl;
}
while(true){
xt[0]=x0[0]+vt[0]*stepL;
xt[1]=x0[1]+vt[1]*stepL;
xt[2]=x0[2]+vt[2]*stepL;
for(int m=0; m<3; m++){
if(abs(xt[m]) > l_c/2.){ //particle outside central cube,// reflected, elastic collision(no!)
//particle enters fron the other way, periodicity
// hit_cx[m] = true; //I don't need it yet
extraL = abs(xt[m]) - l_c/2.;
// xt[m]=-x0[m];
cout<<"proton outside! xt[m]: "<<xt[m]<<" extra lenght: "<<extraL<<endl;
xt[m] = xt[m]-l_c;
cout<<"Relocating => new x[t]: "<<xt[m]<<endl;
}
}
for(int k=0; k<ionpCounter; k++){//check if proton inside SPM
pIONPDist = sqrt(pow((x1_ionp[k]-xt[0]),2.)+pow((x2_ionp[k]-xt[1]),2.)+pow((x3_ionp[k]-xt[2]),2.)) - Rionp;
if(pIONPDist <= 0.){
cout<<"proton inside IONP => reposition! Distance: "<<pIONPDist<<" Rionp: "<<Rionp<<endl;
hit_ionp = true;
}
else if(pIONPDist > 0.){
hit_ionp=false; //with this I don't have to reset flag in the end
//calculations of Bloc for this position
pIONPCosTheta = (x3_ionp[k]-xt[2])/pIONPDist;
Bloc = pow(Rionp,3.)*Beq*(3.*pIONPCosTheta - 1.)/pow(pIONPDist,3.);
Btot += Bloc;
//cout<<"pSPMDist: "<<pSPMDist<<" pSPMCosTheta: "<<pSPMCosTheta<<" Bloc: "<<Bloc<<" Btot: "<<Btot<<endl;
}
else{
cout<<"Something wrong with the calculation of pIONPDist! "<<pIONPDist<<endl;
hit_ionp = true;
}
}
if(!hit_ionp){
// hit_spm=false; //reset flag (unnessesary alreaty false)
break;
}
}// end of while for new position -> the new position is determined, Btot calculated
// Dphase, phase
Dphase = gI*Btot*tstep;
phase += Dphase;
//store phase for this step
//filled for each proton at this timepoint (step)
cosPhase_S[stepCounter] += cos(phase);
sinPhase_S[stepCounter] += sin(phase);
//reset Btot
Btot = 0.;
stepCounter++;
} //end of for loop step
} //end of for loop particles
//-----calculate the <m> the total magnetization
for(int t=0; t<tN; t++){
m_tot[t] = sqrt(pow(cosPhase_S[t],2.) + pow(sinPhase_S[t],2.))/(double)parN;
//cout<<"m_tot[t]: "<<m_tot[t]<<endl;
}
fclose(outP_tPos); //proton time-position
fclose(outP_tB); //proton time-B
//====== outfile data=============//
//----- output data of SPM position---------//
FILE *outP_S;
outP_S = fopen("V3_MAT_positionSPM_spherical.dat","wb+");
if(!outP_S){// file couldn't be opened
cerr << "Error: file could not be opened" << endl;
exit(1);
}
for (int i=0; i<ionpCounter; ++i){
fprintf(outP_S,"%.10f \t %.10f \t %.10f\n",x1_ionp[i],x2_ionp[i],x3_ionp[i]);
}
fclose(outP_S);
FILE *outP_agg;
outP_agg = fopen("V3_MAT_positionAggreg_spherical.dat","wb+");
if(!outP_agg){// file couldn't be opened
cerr << "Error: file could not be opened" << endl;
exit(1);
}
for (int j=0; j<ionpCounter; ++j){
fprintf(outP_agg,"%.10f \t %.10f \t %.10f\n",x1_agg[j],x2_agg[j],x3_agg[j]);
}
fclose(outP_agg);
FILE *outSngl;
outSngl = fopen("V3_MAT_positionSingle_spherical.dat","wb+");
if(!outSngl){// file couldn't be opened
cerr << "Error: file could not be opened" << endl;
exit(1);
}
int findAgg = (int)(realDist(rng)*aggN);
int idxMin = findAgg*ionpN;
int idxMax = idxMin + ionpN;
for (int k=idxMin; k<idxMax; ++k){
fprintf(outSngl,"%.10f\t%.10f\t%.10f\t%.10f\t%.10f\t%.10f\n",x1_agg[k],x2_agg[k],x3_agg[k],x1_ionp[k],x2_ionp[k],x3_ionp[k]);
}
fclose(outSngl);
//delete new arrays
delete[] x1_ionp;
delete[] x2_ionp;
delete[] x3_ionp;
delete[] theta_ionp;
delete[] phi_ionp;
delete[] r_ionp;
delete[] x1_agg;
delete[] x2_agg;
delete[] x3_agg;
delete[] cosPhase_S;
delete[] sinPhase_S;
delete[] m_tot;
}

I tackled the problem in several steps. First, I made the run reproducible:
mt19937 rng(127386261); //I want a deterministic seed
Then I created a script to compare the three output files generated by the program:
#!/bin/bash
diff V3_MAT_positionAggreg_spherical.dat V3_MAT_positionAggreg_spherical2.dat
diff V3_MAT_positionSingle_spherical.dat V3_MAT_positionSingle_spherical2.dat
diff V3_MAT_positionSPM_spherical.dat V3_MAT_positionSPM_spherical2.dat
Here the files ending in 2 are created by the optimized code and the others by your version.
I ran your version compiled with the O3 flag and noted the time (for 20 magnetic particles and 10 protons it takes 79 seconds on my box; my architecture is not that important because we are only going to compare the differences).
Then I started refactoring step by step, comparing the output files and the time after every small change. Here are all the iterations:
Removing a redundant else if gains 5 seconds (total run 74.0 s)
if(sqrt(pow(x1_ionp[k]-x0[0],2.)+pow(x2_ionp[k]-x0[1],2.)+pow(x3_ionp[k]-x0[2],2.)) <= 9*Rionp){
//spm surface closer than 8R: |d - Rionp| <= 8*Rionp is the same as d <= 9*Rionp, since d >= 0
stepL = Rionp/8;
cnt_stpMin ++;
break;
}
else { //this was an else if and an else for error that will never happen
stepL = Rionp;
}
At this point, I ran it under the profiler and the pow function stood out.
Replacing pow with square and cube gains 61 seconds (total run 13.2 s)
Simply replacing pow(x,2.) with square(x) and pow(x,3.) with cube(x) makes the run about 5.6 times faster (74.0 s down to 13.2 s).
double square(double d)
{
return d*d;
}
double cube(double d)
{
return d*d*d;
}
From here on each improvement gains much less, but they still add up.
Removing a redundant sqrt (total run 12.9 s)
double ionpDist = square(x1_ionp[m]-x1_ionp[ionpCounter])+square(x2_ionp[m]-x2_ionp[ionpCounter])+square(x3_ionp[m]-x3_ionp[ionpCounter]);
//cout<<"spmDist: "<<spmDist<<endl;
if((j>0) && (ionpDist <= 4*square_Rionp)){
Introducing const variable square_Rionp and cube_Rionp (total run 12.7 s)
const double square_Rionp = square(Rionp);
const double cube_Rionp = cube(Rionp);
//replaced in the code like this
if((j>0) && (ionpDist <= 4*square_Rionp)){
Introducing variable for pi (total run 12.6 s)
const double Two_PI = PI*2.0;
const double FourThird_PI = PI*4.0/3.0;
Removing another redundant else if (total run 11.9 s)
if(pIONPDist <= 0.){
cout<<"proton inside IONP => reposition! Distance: "<<pIONPDist<<" Rionp: "<<Rionp<<endl;
hit_ionp = true;
}
else { //this was an else if without any reason
hit_ionp=false; //with this I don't have to reset flag in the end
//calculations of Bloc for this position
pIONPCosTheta = (x3_ionp[k]-xt[2])/pIONPDist;
...
}
Removing another redundant square root (total run 11.2 s)
const double Nine_Rionp_squared = square(9*Rionp); // |d - Rionp| <= 8*Rionp is the same as d <= 9*Rionp
...
for(int k=0; k<ionpCounter; k++){
if(square(x1_ionp[k]-x0[0])+square(x2_ionp[k]-x0[1])+square(x3_ionp[k]-x0[2]) <= Nine_Rionp_squared){
//spm surface closer than 8R
stepL = stepL_min;
cnt_stpMin ++;
break;
}
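Both sqrt removals above rely on the same idea: for non-negative values, d <= r holds exactly when d*d <= r*r, so a distance test can compare squared quantities and skip the square root. A minimal sketch of that pattern, where within_distance is a hypothetical helper that is not part of the original program:

```cpp
#include <cassert>

// Same helper as introduced earlier in this answer.
inline double square(double d) { return d * d; }

// Hypothetical helper: true when points a and b are at most r apart.
// Compares squared distances, so no sqrt call is needed.
bool within_distance(const double a[3], const double b[3], double r)
{
    assert(r >= 0.0); // the squared comparison is only valid for r >= 0
    const double d2 = square(a[0] - b[0])
                    + square(a[1] - b[1])
                    + square(a[2] - b[2]);
    return d2 <= square(r);
}
```

If r is constant across the loop, square(r) can be computed once outside it, as done with the precomputed constants above.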
I don't see many more obvious ways to squeeze more performance out of it. Further optimization may require some thinking.
I did another comparison run with 50 magnetic particles and 10 protons and found that my version is 7 times faster than yours while producing exactly the same files.
I would do this exercise with the help of source control.
Your code is trivially parallelizable, but I would go that route only once you have optimized the single-threaded version.
EDIT
Replace the += operator with = (total run 6.23 s)
I noticed that the += operator is used for no reason; replacing it with the = operator is a substantial gain:
cosPhase_S[stepCounter] = cos(phase);
sinPhase_S[stepCounter] = sin(phase);

How to generate a n-sized random float array that sums up to 0.0?

Consider that I need an n-sized vector where each element lies in [-1,1]. The element a[i] is a float generated by -1 + 2*rand(). I need an elegant way to ensure that the sum of the elements of my array is equal to zero.
I've found two possible solutions:
The first one is this MATLAB function: https://www.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum. It also has an implementation in R; however, it is too much work to implement in C, since this function is written for a 2D array.
The second one is provided in this thread: Generate random values with fixed sum in C++. Essentially, the idea is to generate n numbers with a normal distribution and then normalize them to my sum. (I have implemented it in Python below for a vector that sums to 1.0.) It works for every sum value except zero.
import random as rd
mySum = 1;
randomVector = []
randomSum = 0
for i in range(7):
    randomNumber = -1 + 2*rd.random()
    randomVector.append(randomNumber)
    randomSum += randomNumber
coef = mySum/randomSum
myNewList = [j * coef for j in randomVector]
newsum = sum(myNewList)
So, is there a way to do that using C or C++? If you know an already-implemented function, that would be awesome. Thanks.

I figured out a solution to your problem. This is not perfect since its randomness is limited by the range requirement.
The strategy is:
Define a function able to generate a random float in a customizable range. No need to reinvent the wheel: I borrowed it from https://stackoverflow.com/a/44105089/11336762
Allocate the array with malloc (I omit the pointer check in my example) and initialize the seed. In my example I just use the current time, but this can be improved.
For every element to be generated, pre-calculate the random range. Given the i-th partial sum, make sure that the next sum is NEVER out of range: if the sum is positive, the range needs to be (-1,1-sum); if it is negative, the range needs to be (-1-sum,1).
Do this up to the (n-1)th element. The last element must be assigned directly as the sum with its sign changed.
#include<stdio.h>
#include<stdlib.h>
#include<time.h>
float float_rand( float min, float max )
{
float scale = rand() / (float) RAND_MAX; /* [0, 1.0] */
return min + scale * ( max - min ); /* [min, max] */
}
int main( int argc, char *argv[] )
{
if( argc == 2 )
{
int i, n = atoi ( argv[1] );
float *outArr = malloc( n * sizeof( float ) );
float sum = 0;
printf( "Input value: %d\n\n", n );
/* Initialize seed */
srand ( time( NULL ) );
for( i=0; i<n-1; i++ )
{
/* Limit random generation range in order to make sure the next sum is *
* not outside (-1,1) range. */
float min = (sum<0? -1-sum : -1);
float max = (sum>0? 1-sum : 1);
outArr[i] = float_rand( min, max );
sum += outArr[i];
}
/* Set last array element */
outArr[n-1] = -sum;
/* Print results */
sum=0;
for( i=0; i<n; i++ )
{
sum += outArr[i];
printf( " outArr[%d]=%f \t(sum=%f)\n", i, outArr[i], sum );
}
free( outArr );
}
else
{
printf( "Only a parameter allowed (integer N)\n" );
}
}
I tried it, and it works also when n=1. In case of n=0 a sanity check should be added to my example.
Some output examples:
N=1:
Input value: 1
outArr[0]=-0.000000 (sum=-0.000000)
N=4
Input value: 4
outArr[0]=-0.804071 (sum=-0.804071)
outArr[1]=0.810685 (sum=0.006614)
outArr[2]=-0.353444 (sum=-0.346830)
outArr[3]=0.346830 (sum=0.000000)
N=8:
Input value: 8
outArr[0]=-0.791314 (sum=-0.791314)
outArr[1]=0.800182 (sum=0.008867)
outArr[2]=-0.571293 (sum=-0.562426)
outArr[3]=0.293300 (sum=-0.269126)
outArr[4]=-0.082886 (sum=-0.352012)
outArr[5]=0.818639 (sum=0.466628)
outArr[6]=-0.301473 (sum=0.165155)
outArr[7]=-0.165155 (sum=0.000000)

Thank you guys again for the help.
So, based on the idea of Cryostasys I developed the following C code to solve my problem:
#include <stdio.h> /* printf, scanf, puts, NULL */
#include <stdlib.h> /* srand, rand */
#include <time.h> /* time */
#include <math.h>
int main()
{
int arraySize = 10; //input value
double createdArray[arraySize]; //output value
double randomPositiveVector[arraySize];
double randomNegativeVector[arraySize];
double positiveSum = 0.;
double negativeSum = 0.;
srand(time(NULL)); //seed for random generation
for(int i = 0; i < arraySize; ++i)
{
double randomNumber = -1.+2.*rand()/((double) RAND_MAX); //random in [-1.0,1.0]
printf("%f\n",randomNumber);
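if(randomNumber >=0)
{
randomPositiveVector[i] = randomNumber;
randomNegativeVector[i] = 0.; //keep the unused slot defined
positiveSum += randomNumber;
}
else
{
randomNegativeVector[i] = randomNumber;
randomPositiveVector[i] = 0.; //keep the unused slot defined
negativeSum += randomNumber;
}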
}
if(positiveSum == 0. || negativeSum == 0.) printf("ERROR\n");
double positiveCoefficient = 1.0/positiveSum;
double negativeCoefficient = -1.0/negativeSum;
for(int i = 0; i < arraySize; ++i)
{
randomPositiveVector[i] = positiveCoefficient * randomPositiveVector[i];
randomNegativeVector[i] = negativeCoefficient * randomNegativeVector[i];
if(fabs(randomPositiveVector[i]) > 1e-6) //near to zero
{
createdArray[i] = randomPositiveVector[i];
}
else
{
createdArray[i] = randomNegativeVector[i];
}
}
for(int i = 0; i < arraySize; ++i)
{
printf("createdArray[%d] = %9f\n",i,createdArray[i]);
}
return(0);
}
Please note that the randomness of the generated values is reduced, as mentioned in the comments on the question. Also, the kind of random distribution is determined by the function used to generate randomNumber above. In this case, I used rand() from stdlib.h, which generates pseudo-random numbers from a seed. You could use a different option, for instance drand48(), also from stdlib.h.
Nevertheless, at least one positive and one negative value must be generated for this code to work. A verification step was added to the code; if this condition is hit, one should run the code again or handle it some other way.
Output example (arraySize = 10):
createdArray[0] = -0.013824
createdArray[1] = 0.359639
createdArray[2] = -0.005851
createdArray[3] = 0.126829
createdArray[4] = -0.334745
createdArray[5] = -0.473096
createdArray[6] = -0.172484
createdArray[7] = 0.249523
createdArray[8] = 0.262370
createdArray[9] = 0.001640

One option is to generate some samples and then scale their values around the average. In C++ it would be something like the following
#include <iostream>
#include <iomanip>
#include <random>
#include <algorithm>
#include <cmath>
#include <vector>
int main()
{
std::random_device rd;
std::seed_seq ss{rd(), rd(), rd(), rd()};
std::mt19937 gen{ss};
const int samples = 9;
// Generates the samples in [0, 2]
std::uniform_real_distribution dist(0.0, std::nextafter(2.0, 3.0));
std::vector<double> nums(samples);
double sum = 0.0;
for ( auto & i : nums )
{
i = dist(gen);
sum += i;
}
double average = sum / samples;
double k = 1.0 / std::max(average, 2.0 - average);
// Transform the values (apart from the last) to meet the requirements
sum = 0.0;
for ( size_t i = 0; i < nums.size() - 1; ++i )
{
nums[i] = (nums[i] - average) * k;
sum += nums[i];
}
// This trick (to ensure the needed precision) only works if the sum
// is always evaluated in the same order
nums.back() = 0.0 - sum;
sum = 0.0;
for ( size_t i = 0; i < nums.size(); ++i )
{
sum += nums[i];
std::cout << std::setw(10) << std::fixed << nums[i] << '\n';
}
if (sum != 0.0)
std::cout << "Failed.\n";
}

Trying to compute e^x when x_0 = 1

I am trying to compute the Taylor series expansion for e^x at x_0 = 1. I am having a very hard time understanding what exactly I am looking for. I am pretty sure I am trying to find a decimal approximation of e^x at x_0 = 1. However, when I run this code with x_0 = 0, I get the wrong output, which leads me to believe that I am computing this incorrectly.
Here is my class e.hpp
#ifndef E_HPP
#define E_HPP
class E
{
public:
int factorial(int n);
double computeE();
private:
int fact = 1;
int x_0 = 1;
int x = 1;
int N = 10;
double e = 2.718;
double sum = 0.0;
};
#endif
Here is my e.cpp
#include "e.hpp"
#include <cmath>
#include <iostream>
int E::factorial(int n)
{
if(n == 0) return 1;
for(int i = 1; i <= n; ++i)
{
fact = fact * i;
}
return fact;
}
double E::computeE()
{
sum = std::pow(e,x_0);
for(int i = 1; i < N; ++i)
{
sum += ((std::pow(x-x_0,i))/factorial(i));
}
return e * sum;
}
In main.cpp
#include "e.hpp"
#include <iostream>
#include <cmath>
int main()
{
E a;
std::cout << "E calculated at x_0 = 1: " << a.computeE() << std::endl;
std::cout << "E Calculated with std::exp: " << std::exp(1) << std::endl;
}
Output:
E calculated at x_0 = 1: 7.38752
E calculated with std::exp: 2.71828
When I change to x_0 = 0.
E calculated at x_0 = 0: 7.03102
E calculated with std::exp: 2.71828
What am I doing wrong? Am I implementing the Taylor Series incorrectly? Is my logic incorrect somewhere?

Yeah, your logic is incorrect somewhere.
Like Dan says, you have to reset fact to 1 each time you calculate the factorial. You might even make it local to the factorial function.
In the return statement of computeE you are multiplying the sum by e, which you do not need to do. The sum is already the Taylor approximation of e^x.
The Taylor series for e^x about 0 is the sum from i=0 to infinity of x^i / i!, so x_0 should indeed be 0 in your program.
Technically your computeE computes the right value for sum when you have x_0=0, but it's kind of strange. The Taylor series starts at i=0, but you start the loop with i=1. However, the first term of the Taylor series is x^0 / 0! = 1 and you initialize sum to std::pow(e, x_0) = std::pow(e, 0) = 1, so it works out mathematically.
(Your computeE function also computed the right value for sum when you had x_0 = 1. You initialized sum to std::pow(e, 1) = e, and then the for loop didn't change its value at all because x - x_0 = 0.)
However, as I said, in either case you don't need to multiply it by e in the return statement.
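For reference, the general expansion about an arbitrary point x_0 can be written out as below. This is the standard Taylor identity for the exponential, added here as an aid and not something stated in the question; it uses the fact that every derivative of e^x at x_0 equals e^{x_0}.

```latex
e^{x} = \sum_{i=0}^{\infty} \frac{e^{x_0}}{i!}\,(x - x_0)^{i}
      = e^{x_0} \sum_{i=0}^{\infty} \frac{(x - x_0)^{i}}{i!}
```

With x_0 = 0 the factor e^{x_0} is 1 and the sum reduces to the series about 0. With x_0 = 1 and x = 1 every term with i >= 1 vanishes and only e^{x_0} = e remains, so expanding about 1 requires already knowing e.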
I would change the computeE code to this:
double E::computeE()
{
sum = 0;
for(int i = 0; i < N; ++i)
{
sum += ((std::pow(x-x_0,i))/factorial(i));
std::cout << sum << std::endl;
}
return sum;
}
and set x_0 = 0.

"fact" must be reset to 1 each time you calculate factorial. It should be a local variable instead of a class variable.
When "fact" is a class variable, and you let "factorial" change it to, say, 6, it will still have the value 6 when you call "factorial" a second time. And this will only get worse. Remove your declaration of "fact" and use this instead:
int E::factorial(int n)
{
int fact = 1;
if(n == 0) return 1;
for(int i = 1; i <= n; ++i)
{
fact = fact * i;
}
return fact;
}

Write less code.
Don't use factorial.
Here it is in Java. You should have no trouble converting this to C++:
/**
* @link https://stackoverflow.com/questions/46148579/trying-to-compute-ex-when-x-0-1
* @link https://en.wikipedia.org/wiki/Taylor_series
*/
public class TaylorSeries {
private static final int DEFAULT_NUM_TERMS = 50;
public static void main(String[] args) {
int xmax = (args.length > 0) ? Integer.valueOf(args[0]) : 10;
for (int i = 0; i < xmax; ++i) {
System.out.println(String.format("x: %10.5f series exp(x): %10.5f function exp(x): %10.5f", (double)i, exp(i), Math.exp(i)));
}
}
public static double exp(double x) {
return exp(DEFAULT_NUM_TERMS, x);
}
// This is the Taylor series for exp that you want to port to C++
public static double exp(int n, double x) {
double value = 1.0;
double term = 1.0;
for (int i = 1; i <= n; ++i) {
term *= x/i;
value += term;
}
return value;
}
}
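For reference, a possible C++ port of the exp(int n, double x) method above. This is only a sketch; the function name taylor_exp is my own choice and not something from your class:

```cpp
// Taylor series for exp(x) about 0, summing the terms i = 0..n.
// Each term is built from the previous one (term_i = term_{i-1} * x / i),
// so neither a factorial nor a pow call is needed.
double taylor_exp(int n, double x) {
    double value = 1.0;  // the i = 0 term
    double term = 1.0;
    for (int i = 1; i <= n; ++i) {
        term *= x / i;
        value += term;
    }
    return value;
}
```

You can check it by printing taylor_exp(50, x) next to std::exp(x) from <cmath> for a few values of x, the same way the Java main does. Because the term is updated incrementally, there is no integer factorial that can overflow.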

Reading 2 CSV files and using vectors to store the values and then calculate the coefficient. Returns -1.#IND

This is the code I have. It builds and creates the .exe file, but every value I print along the way (mean, covariance, coefficient) comes back as -1.#IND. I think it may not be reading the data from the CSV files correctly?
// basic file operations
#include <iterator>
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <string>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
using namespace std;
typedef vector<double> Prices;
Prices parse_csv_line(string& line)
{
Prices result;
string datum;
stringstream ss(line);
int count=0;
while(getline(ss,datum,','))
{
// convert every second field (the even-numbered ones) to double
count++;
if (count%2 == 0)
result.push_back(atof(datum.c_str()));
}
return result;
}
Prices parse_csv_file(const char* filename)
{
ifstream file(filename);
Prices prices;
string line;
// This will discard the header line
getline(file, line);
// This will get each line in the file, and collate its values
while (getline(file, line))
{
Prices v = parse_csv_line(line);
prices.insert(prices.end(), v.begin(), v.end());
}
for(Prices::iterator it=prices.begin(); it != prices.end(); it++)
cout << " " << *it;
return prices;
}
// Calculate the correlation of series A and B, then return it.
/* Calculate the mean averages for A and B.
(For each series, add each sample and then divide by the number of samples.) */
double CalculateMean(Prices x)
{
double sum = 0;
for(size_t i = 0; i < x.size(); i++)
sum += x[i];
return (sum / x.size());
}
/* Calculate the variance for A and B.
(First calculate the difference from the mean for each sample number. Square each number then divide by the number of samples (n).
If the numbers you are calculating represent a sample of a larger group, then you would divide by n – 1.) */
double CalculateVariance(Prices x)
{
double mean = CalculateMean(x);
double temp = 0;
for(size_t i = 0; i < x.size(); i++)
{
temp += (x[i] - mean) * (x[i] - mean) ;
}
return temp / x.size();
}
/* calculate the standard deviation for A and B, which is the square root of the variance.
(This number will tell you how closely your samples are located to the mean.) */
double Calculate_StandardDeviation(Prices x)
{
return sqrt(CalculateVariance(x));
}
/* Lastly, calculate the Covariance of the 2 series.
(This value can be used to represent the linear relationship between two variables.) */
double Calculate_Covariance(Prices x, Prices y)
{
double meanX = CalculateMean(x);
double meanY = CalculateMean(y);
cout << "mean x = " << meanX << "\n";
cout << "mean y = " << meanY << "\n";
double total = 0;
for(size_t i = 0; i < x.size(); i++)
{
total += (x[i] - meanX) * (y[i] - meanY);
}
return total / x.size();
}
// Using the calculated values, these can then be inputted into the Correlation Coefficient formula to find the correlation of series A and B.
double Calculate_Correlation(Prices x, Prices y)
{
double covariance = Calculate_Covariance(x, y);
cout << "covariance =" << covariance << "\n";
double correlation = covariance / (Calculate_StandardDeviation(x) * Calculate_StandardDeviation(y));
return correlation;
}
int main()
{
Prices a = parse_csv_file("PC1_A.CSV");
Prices b = parse_csv_file("PC1_B.CSV");
double correlation = Calculate_Correlation(a, b);
cout << "Correlation is: " << correlation;
cin.get();
}
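
As far as I know, -1.#IND is how the Microsoft runtime prints an indeterminate NaN, which is what 0.0 / 0 produces. If either file fails to open (for example because the program's working directory is not where the CSV files are), parse_csv_file returns an empty vector and CalculateMean divides by x.size() == 0. Other possibilities are series of different lengths or a series with zero variance. A minimal sketch of guards that would surface the first case, assuming that is the cause; the helper names require_open and checked_mean are hypothetical:

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<double> Prices;

// Hypothetical helper: fail loudly when the stream is not usable instead of
// silently carrying on with an empty vector.
void require_open(const std::ifstream& file, const std::string& filename) {
    if (!file) {
        throw std::runtime_error("could not open " + filename);
    }
}

// Hypothetical helper: a mean that rejects empty input, so 0.0 / 0 never
// reaches the variance and covariance calculations.
double checked_mean(const Prices& x) {
    if (x.empty()) {
        throw std::invalid_argument("mean of an empty series is undefined");
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i];
    }
    return sum / x.size();
}
```

Calling require_open(file, filename) right after constructing the ifstream in parse_csv_file, and using checked_mean in place of CalculateMean, would turn the silent NaN into an error message that says which step failed.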

Numerical solution to differential equations in C++, path to take?

Edit
I am now using odeint. It is fairly simple to use and less memory hungry than my brute force algorithm implementation.
Check my questions here-->http://stackoverflow.com/questions/12060111/using-odeint-function-definition
and here-->http://stackoverflow.com/questions/12150160/odeint-streaming-observer-and-related-questions
I am trying to implement a numerical method (Explicit Euler) to solve a set of three coupled differential equations. I have worked with C before, but that was a very long time ago (effectively forgotten everything). I have a pretty good idea on what I want my program to do and also have a rough algorithm.
I am interested in using C++ for this task (picked up Stroustrup's Programming: Principles and Practice Using C++). My question is, should I go with arrays or vectors? Vectors seem easier to handle, but I was unable to find out how to make a function return a vector. Is it possible for a function to return more than one vector? At this point, I am familiarizing myself with the C++ syntax.
I basically need my function to return many arrays. I realize that it is not possible in C++, so I can also work with some nested structure such as {{arr1},{arr2},{arr3}..}. Please bear with me as I am a noob and come from programming in Mathematica.
Thanks!

If you want to use C++ for integrating ordinary differential equations and you don't want to reinvent the wheel, use odeint. This library is on its way to becoming the de facto standard for solving ODEs in C++. The code is very flexible and highly optimized, and can compete with any handcrafted C code (and Fortran anyway).
Commenting on your question about returning vectors or arrays: functions can return vectors, and arrays if they are wrapped in a class (like std::array). But this is not recommended, since you make many unnecessary copies (including calling the constructors and destructors every time).
I assume you want to put your equation into a C++ function and let it return the resulting vector. For this task it's much better to pass a reference to a vector to the function and let the function fill that vector. This is also how odeint implements it.
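As a rough sketch of that pass-by-reference style, here is an explicit Euler integrator written without any library. The system in rhs is a made-up linear example (not your equations), and the names State, rhs, euler_step and integrate are hypothetical. The argument order of rhs follows what I understand odeint's system functions to look like (state in, derivative out, time):

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<double> State;

// Hypothetical right-hand side of three coupled ODEs, dx/dt = f(x, t).
// The result is written into dxdt, which the caller owns.
void rhs(const State& x, State& dxdt, double /*t*/) {
    dxdt[0] = -x[0] + x[1];
    dxdt[1] = -x[1] + x[2];
    dxdt[2] = -x[2];
}

// One explicit Euler step: x <- x + dt * f(x, t).
// dxdt is scratch space supplied by the caller.
void euler_step(State& x, State& dxdt, double t, double dt) {
    rhs(x, dxdt, t);
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] += dt * dxdt[i];
    }
}

// Integrate from t = 0 over `steps` steps, updating x in place.
// The scratch vector is allocated once and reused for every step.
void integrate(State& x, double dt, int steps) {
    State dxdt(x.size());
    double t = 0.0;
    for (int k = 0; k < steps; ++k) {
        euler_step(x, dxdt, t, dt);
        t += dt;
    }
}
```

To use it, fill a State with the three initial values and call integrate(x, dt, steps); x then holds the solution at the final time. If you need the whole trajectory rather than just the end point, one option is to push a copy of x onto a std::vector<State> after each step.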

This link might help you, though it only covers ordinary differential equations:
http://www.codeproject.com/KB/recipes/odeint.aspx

To make the program do what you want, you could take a look at this code. It may get you started.
I found it very useful, and I tested it against a Mathematica solution; the results agree.
For more information, see the course page linked in the header comment of the code.
/*
A simple code for option valuation using the explicit forward Euler method
for the class Derivative Securities, fall 2010
http://www.math.nyu.edu/faculty/goodman/teaching/DerivSec10/index.html
Written for this purpose by Jonathan Goodman, instructor.
Assignment 8
*/
#include <iostream>
#include <fstream>
#include <math.h>
#define NSPOTS 100 /* The number of spot prices computed */
/* A program to compute a simple binomial tree price for a European style put option */
using namespace std;
// The pricer, main is at the bottom of the file
void FE( // Solve a pricing PDE using the forward Euler method
double T, double sigma, double r, double K, // The standard option parameters
double Smin, double Smax, // The min and max prices to return
int nPrices, // The number of prices to compute between Smin and Smax,
// Determines the accuracy and the cost of the computation
double prices[], // An array of option prices to be returned.
double intrinsic[], // The intrinsic value at the same prices
double spots[], // The corresponding spot prices, computed here for convenience.
// Both arrays must be allocated in the calling procedure
double *SEarly ) { // The early exercise boundary
// Setup for the computation, compute computational parameters and allocate the memory
double xMin = log(Smin); // Work in the log variable
double xMax = log(Smax);
double dx = ( xMax - xMin )/ ( (double( nPrices - 1 ) ) ); // The number of gaps is one less than the number of prices
double CFL = .8; // The time step ratio
double dt = CFL*dx*dx/sigma; // The forward Euler time step size, to be adjusted slightly
int nTimeSteps = (int) (T/dt); // The number of time steps, rounded down to the nearest integer
nTimeSteps++; // Now rounded up
dt = T / ( (double) nTimeSteps ); // Adjust the time step to land precisely at T in n steps.
int nx = nPrices + 2*nTimeSteps; // The number of prices at the final time, growing by 2 ...
// ... each time step
xMin = xMin - nTimeSteps*dx; // The x values now start here
double *fOld; // The values of the pricing function at the old time
fOld = new double [nx]; // Allocated using old style C++ for simplicity
double *fNew; // The values of the pricing function at the new time
fNew = new double [nx];
double *V; // The intrinsic value = the final condition
V = new double [nx];
// Get the final conditions and the early exercise values
double x; // The log variable
double S; // A stock price = exp(x)
int j;
for ( j = 0; j < nx; j++ ) {
x = xMin + j*dx;
S = exp(x);
if ( S < K ) V[j] = K-S; // A put struck at K
else V[j] = 0;
fOld[j] = V[j]; // The final condition is the intrinsic value
}
// The time stepping loop
double alpha, beta, gamma; // The coefficients in the finite difference formula
alpha = beta = gamma = .333333333333; // XXXXXXXXXXXXXXXXXXXXXXXXXXX
int jMin = 1; // The smallest and largest j ...
int jMax = nx - 1; // ... for which f is updated. Skip 1 on each end the first time.
int jEarly ; // The last index of early exercise
for ( int k = nTimeSteps; k > 0; k-- ) { // This is, after all, a backward equation
jEarly = 0; // re-initialize the early exercise pointer
for ( j = jMin; j < jMax; j++ ) { // Compute the new values
x = xMin + j*dx; // In case the coefficients depend on S
S = exp(x);
fNew[j] = alpha*fOld[j-1] + beta*fOld[j] + gamma*fOld[j+1]; // Compute the continuation value
if ( fNew[j] < V[j] ) {
fNew[j] = V[j]; // Take the max with the intrinsic value
jEarly = j; // and record the largest early exercise index
}
}
for ( j = jMin; j < jMax; j++ ) // Copy the new values back into the old array
fOld[j] = fNew[j];
jMin++; // Move the boundaries in by one
jMax--;
}
// Copy the computed solution into the desired place
jMin--; // The last decrement and increment were mistakes
jMax++;
int i = 0; // The index into the output array
for ( j = jMin; j < jMax; j++ ) { // Now the range of j should match the output array
x = xMin + j*dx;
S = exp(x);
prices[i] = fOld[j];
intrinsic[i] = V[j];
spots[i++] = S; // Increment i after all copy operations
}
double xEarly = xMin + jEarly*dx;
*SEarly = exp(xEarly); // Pass back the computed early exercise boundary
delete[] fNew; // Be a good citizen and free the memory when you're done.
delete[] fOld; // Arrays allocated with new[] must be released with delete[].
delete[] V;
return;
}
int main() {
cout << "Hello " << endl;
ofstream csvFile; // The file for output, will be csv format for Excel.
csvFile.open ("PutPrice.csv");
double sigma = .3;
double r = .003;
double T = .5;
double K = 100;
double Smin = 60;
double Smax = 180;
double prices[NSPOTS];
double intrinsic[NSPOTS];
double spots[ NSPOTS];
double SEarly;
FE( T, sigma, r, K, Smin, Smax, NSPOTS, prices, intrinsic, spots, &SEarly );
for ( int j = 0; j < NSPOTS; j++ ) { // Write out the spot prices for plotting
csvFile << spots[j];
if ( j < (NSPOTS - 1) ) csvFile << ", "; // Don't put a comma after the last value
}
csvFile << endl;
for ( int j = 0; j < NSPOTS; j++ ) { // Write out the intrinsic prices for plotting
csvFile << intrinsic[j];
if ( j < (NSPOTS - 1) ) csvFile << ", "; // Don't put a comma after the last value
}
csvFile << endl;
for ( int j = 0; j < NSPOTS; j++ ) { // Write out the computed option prices
csvFile << prices[j];
if ( j < (NSPOTS - 1) ) csvFile << ", ";
}
csvFile << endl;
csvFile << "Critical price," << SEarly << endl;
csvFile << "T ," << T << endl;
csvFile << "r ," << r << endl;
csvFile << "sigma ," << sigma << endl;
csvFile << "strike ," << K << endl;
return 0 ;
}
