Suppose we need to generate a very long harmonic signal, ideally infinitely long. At first glance, the solution seems trivial:
Sample1:
float t = 0;
while (runned)
{
float v = sinf(w * t);
t += dt;
}
Unfortunately, this solution does not work. For t >> dt, the limited precision of float produces incorrect values: near t = 10^6, for example, adjacent float values are 0.0625 apart, so a dt much smaller than that is rounded away entirely and t stops advancing. Fortunately, we can recall that sin(2*PI*n + x) = sin(x) for any integer n, so it is not difficult to modify the example into an "infinite" analog:
Sample2:
float t = 0;
float tau = 2 * M_PI / w;
while (runned)
{
float v = sinf(w * t);
t += dt;
if (t > tau) t -= tau;
}
For one physical simulation, I needed to get an infinite signal that is the sum of harmonic signals, like this:
Sample3:
float getSignal(float x)
{
float ret = 0;
for (int i = 0; i < modNum; i++)
ret += sin(w[i] * x);
return ret;
}
float t = 0;
while (runned)
{
float v = getSignal(t);
t += dt;
}
In this form, the code does not work correctly for large t, for reasons similar to those in Sample1. The question is: how can I get an "infinite" implementation of the Sample3 algorithm? I assume the solution should look like Sample2. A very important note: generally speaking, the w[i] are arbitrary and not harmonics, i.e. the frequencies are not multiples of some base frequency, so I cannot find a common tau. Using types with greater precision (double, long double) is not allowed.
Thanks for your advice!
You can choose an arbitrary tau and store the phase remainders for each mode when subtracting it from t (as #Damien suggested in the comments).
Also, representing the time as t = dt * it, where it is an integer, can improve numerical stability (I think).
Maybe something like this:
int ndt = 1000; // accumulate phase every 1000 steps, for example
float tau = dt * ndt; // time span covered by one block of ndt steps
std::vector<float> phases(modNum, 0.0f);
int it = 0;
while (runned)
{
    float t = dt * it; // recomputed from the integer step count each iteration
    float v = 0.0f;
    for (int i = 0; i < modNum; i++)
    {
        v += sinf(w[i] * t + phases[i]);
    }
    if (++it >= ndt)
    {
        it = 0;
        for (int i = 0; i < modNum; ++i)
        {
            // fold the phase advance over tau back into [0, 2*PI)
            phases[i] = std::fmod(w[i] * tau + phases[i], 2 * M_PI);
        }
    }
}
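A note on why this works: t is recomputed as dt * it each step rather than accumulated by repeated addition, so the rounding error in t is bounded by that of a single multiplication, and each phases[i] is reduced modulo 2*PI before it can grow large. Together these keep the argument of sinf small, which is where float retains its precision.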
Related
I've been trying to write a function to approximate the value of an integral using the Composite Simpson's Rule.
template <typename func_type>
double simp_rule(double a, double b, int n, func_type f){
int i = 1; double area = 0;
double n2 = n;
double h = (b-a)/(n2-1), x=a;
while(i <= n){
area = area + f(x)*pow(2,i%2 + 1)*h/3;
x+=h;
i++;
}
area -= (f(a) * h/3);
area -= (f(b) * h/3);
return area;
}
What I do is multiply each value of the function by either 2 or 4 (and h/3) with pow(2,i%2 + 1) and subtract off the edges as these should only have a weight of 1.
At first I thought it worked just fine; however, when I compared it to my Trapezoidal Method function, it was way more inaccurate, which shouldn't be the case.
This is a simpler version of a code I previously wrote which had the same problem, I thought that if I cleaned it up a little the problem would go away, but alas. From another post, I get the idea that there's something going on with the types and the operations I'm doing on them which results in loss of precision, but I just don't see it.
Edit:
For completeness, I was running it for e^x from 0 to 1:
// function to be approximated
double f(double x){ double a = exp(x); return a; }
int main() {
    int n = 11; // this method works best for odd values of n
    double e = exp(1);
    double exact = e - 1; // value of the integral of e^x from 0 to 1
    cout << simp_rule(0, 1, n, f) - exact;
}
Simpson's Rule uses this approximation to estimate a definite integral:

∫ from a to b of f(x) dx ≈ (h/3) * [f(x_0) + 4 f(x_1) + 2 f(x_2) + 4 f(x_3) + ... + 2 f(x_{n-2}) + 4 f(x_{n-1}) + f(x_n)]

where h = (b - a) / n and x_i = a + i*h, so that there are n + 1 equally spaced sample points x_i.
In the posted code, the parameter n passed to the function appears to be the number of points where the function is sampled (while in the previous formula n is the number of intervals, that's not a problem).
The (constant) distance between the points is calculated correctly
double h = (b - a) / (n - 1);
The while loop used to sum the weighted contributions of all the points iterates from x = a up to a point with an abscissa close to b, but probably not exactly b, due to rounding errors. This implies that the last calculated value of f, f(x_n), may be slightly different from the expected f(b).
This is nothing, though, compared to the error caused by the fact that those end points are summed inside the loop with the starting weight of 4 and then subtracted after the loop with weight 1, while all the inner points have their weights switched. As a matter of fact, this is what the code calculates:

(h/3) * [3 f(x_0) + 2 f(x_1) + 4 f(x_2) + 2 f(x_3) + ... + 4 f(x_{n-3}) + 2 f(x_{n-2}) + 3 f(x_{n-1})]

while the correct weights would be 1, 4, 2, 4, ..., 2, 4, 1.
Also, using pow(2, i%2 + 1) to generate the sequence 4, 2, 4, 2, ..., 4 is a waste in terms of efficiency, and may add (depending on the implementation) other unnecessary rounding errors.
The following algorithm shows how to obtain the same (fixed) result, without a call to that library function.
template <typename func_type>
double simpson_rule(double a, double b,
int n, // Number of intervals
func_type f)
{
double h = (b - a) / n;
// Internal sample points, there should be n - 1 of them
double sum_odds = 0.0;
for (int i = 1; i < n; i += 2)
{
sum_odds += f(a + i * h);
}
double sum_evens = 0.0;
for (int i = 2; i < n; i += 2)
{
sum_evens += f(a + i * h);
}
return (f(a) + f(b) + 2 * sum_evens + 4 * sum_odds) * h / 3;
}
Note that this function requires the number of intervals to be passed, not the number of points (e.g. use 10 instead of 11 to obtain the same results as OP's function).
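As a quick sanity check (reusing f and the exact value from the question), the corrected function can be compared against the integral of e^x over [0, 1]:

double f(double x) { return exp(x); } // same integrand as in the question

int main() {
    double exact = exp(1.0) - 1.0; // exact value of the integral of e^x from 0 to 1
    double approx = simpson_rule(0.0, 1.0, 10, f); // 10 intervals = OP's 11 points
    cout << approx - exact << '\n'; // error is on the order of 1e-6 for h = 0.1
}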
The above excellent and accepted solution could benefit from liberal use of std::fma() and from being templatized on the floating-point type.
https://en.cppreference.com/w/cpp/numeric/math/fma
#include <cmath>
template <typename fptype, typename func_type>
fptype simpson_rule(fptype a, fptype b,
int n, // Number of intervals
func_type f)
{
fptype h = (b - a) / n;
// Internal sample points, there should be n - 1 of them
fptype sum_odds = 0.0;
for (int i = 1; i < n; i += 2)
{
sum_odds += f(std::fma(i,h,a));
}
fptype sum_evens = 0.0;
for (int i = 2; i < n; i += 2)
{
sum_evens += f(std::fma(i,h,a));
}
return (std::fma(2,sum_evens,f(a)) +
std::fma(4,sum_odds,f(b))) * h / 3;
}
I have a program that solves for 1D Brownian motion using Euler's method.
Being a stochastic process, I want to average it over many particles. But I find that as I ramp up the number of particles, the program runs out of memory and I get a std::bad_alloc error.
Here is my full code:
#include <iostream>
#include <vector>
#include <fstream>
#include <cmath>
#include <cstdlib>
#include <limits>
#include <ctime>
using namespace std;
// Box-Muller Method to generate gaussian numbers
double generateGaussianNoise(double mu, double sigma) {
const double epsilon = std::numeric_limits<double>::min();
const double tau = 2.0 * 3.14159265358979323846;
static double z0, z1;
static bool generate;
generate = !generate;
if (!generate) return z1 * sigma + mu;
double u1, u2;
do {
u1 = rand() * (1.0 / RAND_MAX);
u2 = rand() * (1.0 / RAND_MAX);
} while (u1 <= epsilon);
z0 = sqrt(-2.0 * log(u1)) * cos(tau * u2);
z1 = sqrt(-2.0 * log(u1)) * sin(tau * u2);
return z0 * sigma + mu;
}
int main() {
// Initialize Variables
double gg; // Gaussian Number Picked from distribution
// Integrator
double t0 = 0; // Setting the Time Window
double tf = 10;
double n = 5000; // Number of Steps
double h = (tf - t0) / n; // Time Step Size
// Set Constants
const double pii = atan(1) * 4; // pi
const double eta = 1; // viscous constant
const double m = 1; // mass
const double aa = 1; // radius
const double Temp = 30; // Temperature in Kelvins
const double KB = 1; // Boltzmann Constant
const double alpha = (6 * pii * eta * aa);
// More Constants
const double mu = 0; // Gaussian Mean
const double sigma = 1; // Gaussian Std Deviation
const double ng = n; // No. of pts to generate for Gauss distribution
const double npart = 1000; // No. of Particles
// Initial Conditions
double x0 = 0;
double y0 = 0;
double t = t0;
// Vectors
vector<double> storX; // Vector that keeps displacement values
vector<double> storY; // Vector that keeps velocity values
vector<double> storT; // Vector to store time
vector<double> storeGaussian; // Vector to store Gaussian numbers generated
vector<double> holder; // Placeholder Vector for calculation operations
vector<double> mainstore; // Vector that holds the final value desired
storT.push_back(t0);
// Prepares mainstore
for (int z = 0; z < (n+1); z++) {
mainstore.push_back(0);
}
for (int NN = 0; NN < npart; NN++) {
holder.clear();
storX.clear();
storY.clear();
storT.clear();
storT.push_back(0);
// Prepares holder
for (int z = 0; z < (n+1); z++) {
holder.push_back(0);
storX.push_back(0);
storY.push_back(0);
}
// Gaussian Generator
srand(time(NULL));
for (double iiii = 0; iiii < ng; iiii++) {
gg = generateGaussianNoise(0, 1); // generateGaussianNoise(mu,sigma)
storeGaussian.push_back(gg);
}
// Solver
for (int ii = 0; ii < n; ii++) {
storY[ii + 1] =
storY[ii] - (alpha / m) * storY[ii] * h +
(sqrt(2 * alpha * KB * Temp) / m) * sqrt(h) * storeGaussian[ii];
storX[ii + 1] = storX[ii] + storY[ii] * h;
holder[ii + 1] =
pow(storX[ii + 1], 2); // Finds the displacement squared
t = t + h;
storT.push_back(t);
}
// Updates the Main Storage
for (int z = 0; z < storX.size(); z++) {
mainstore[z] = mainstore[z] + holder[z];
}
}
// Average over the number of particles
for (int z = 0; z < storX.size(); z++) {
mainstore[z] = mainstore[z] / (npart);
}
// Outputs the data
ofstream fout("LangevinEulerTest.txt");
for (int jj = 0; jj < storX.size(); jj++) {
fout << storT[jj] << '\t' << mainstore[jj] << '\t' << storX[jj] << endl;
}
return 0;
}
As you can see, npart is the variable I change to vary the number of particles. But after each iteration I do clear my storage vectors like storX, storY... So on paper, the number of particles should not affect memory? I am only making the program repeat many more times and add onto the main storage vector mainstore. I am running my code on a computer with 4GB of RAM.
Would greatly appreciate it if anyone could point out my errors in logic or suggest improvements.
Edit: Currently the number of particles is set to npart = 1000.
So when I try to ramp it up to like npart = 20000 or npart = 50000, it gives me memory errors.
Edit 2: I've edited the code to allocate an extra index in each of the storage vectors, but it does not seem to fix the memory overflow.
There is an out-of-bounds access in the solver part: storY has size n, and you access index ii + 1, where ii goes up to n - 1. So, for the code provided, storY has size 5000; it is allowed to access indices 0 through 4999 (inclusive), but you try to access index 5000. The same goes for storX, holder, and mainstore.
Also, storeGaussian does not get cleared before new values are added, so it grows by n elements on every npart iteration, even though the solver only ever accesses its first n values. With n = 5000 and npart = 50000 that is 2.5 * 10^8 doubles, about 2 GB, which explains the bad_alloc.
Please note that vector::clear removes all elements from the vector but does not necessarily change the vector's capacity (i.e. its storage array); see the documentation.
This won't cause the problem here, because you'll reuse the same array in the next runs, but it's something to be aware of when using vectors.
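A minimal sketch of the fix for the unbounded growth (keeping everything else as in the question): clear storeGaussian at the top of each particle iteration, so its memory is reused instead of growing with npart.

// Inside the particle loop, before the Gaussian generator:
storeGaussian.clear(); // drops the previous particle's numbers; capacity is reused
for (int iiii = 0; iiii < ng; iiii++) {
    storeGaussian.push_back(generateGaussianNoise(mu, sigma)); // mu = 0, sigma = 1
}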
I am converting equations to C++. Is this correct for a running standard deviation?
this->runningStandardDeviation = (this->sumOfProcessedSquaredSamples - sumSquaredDividedBySampleCount) / (sampleCount - 1);
Here is the full function:
void BM_Functions::standardDeviationForRunningSamples (float samples [], int sampleCount)
{
// update the running process samples count
this->totalSamplesProcessed += sampleCount;
// get the mean of the samples
double mean = meanForSamples(samples, sampleCount);
// sum the deviations
// sum the squared deviations
for (int i = 0; i < sampleCount; i++)
{
// update the deviation sum of processed samples
double deviation = samples[i] - mean;
this->sumOfProcessedSamples += deviation;
// update the squared deviations sum
double deviationSquared = deviation * deviation;
this->sumOfProcessedSquaredSamples += deviationSquared;
}
// get the sum squared
double sumSquared = this->sumOfProcessedSamples * this->sumOfProcessedSamples;
// get the sum/N
double sumSquaredDividedBySampleCount = sumSquared / this->totalSamplesProcessed;
this->runningStandardDeviation = sqrt((this->sumOfProcessedSquaredSamples - sumSquaredDividedBySampleCount) / (sampleCount - 1));
}
A numerically stable and efficient algorithm for computing the running mean and variance/SD is Welford's algorithm.
One C++ implementation would be:
std::pair<double,double> getMeanVariance(const std::vector<double>& vec) {
    double mean = 0, M2 = 0, variance = 0;
    size_t n = vec.size();
    for (size_t i = 0; i < n; ++i) {
        double delta = vec[i] - mean;  // deviation from the previous running mean
        mean += delta / (i + 1);       // update the running mean
        M2 += delta * (vec[i] - mean); // update the running sum of squared deviations
        variance = M2 / (i + 1);       // population variance; use M2 / i for the sample variance
        if (i >= 2) {
            // <-- You can use the running mean and variance here
        }
    }
    return std::make_pair(mean, variance);
}
Note: to get the SD, just take sqrt(variance)
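A quick usage sketch (the sample values here are made up):

std::vector<double> samples{1.0, 2.0, 4.0, 8.0}; // hypothetical data
std::pair<double,double> mv = getMeanVariance(samples);
double sd = std::sqrt(mv.second); // SD = sqrt(variance), as noted above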
You may check for a sufficient sampleCount (1 would cause a division by zero).
Make sure that the variables have a suitable data type (floating point).
Otherwise this looks correct...
I am trying to calculate complex numbers for a 2D array in C++. The code is running very slowly and I have narrowed down the main cause to be the exp function (the program runs quickly when I comment out that line, even though I have 4 nested loops).
int main() {
typedef vector< complex<double> > complexVect;
typedef vector<double> doubleVect;
const int SIZE = 256;
vector<doubleVect> phi_w(SIZE, doubleVect(SIZE));
vector<complexVect> phi_k(SIZE, complexVect(SIZE));
complex<double> i (0, 1), cmplx (0, 0);
complex<double> temp;
int x, y, t, k, w;
double dk = 2.0*M_PI / (SIZE-1);
double dt = M_PI / (SIZE-1);
int xPos, yPos;
double arg, arg2, arg4;
complex<double> arg3;
double angle;
vector<complexVect> newImg(SIZE, complexVect(SIZE));
for (x = 0; x < SIZE; ++x) {
xPos = -127 + x;
for (y = 0; y < SIZE; ++y) {
yPos = -127 + y;
for (t = 0; t < SIZE; ++t) {
temp = cmplx;
angle = dt * t;
arg = xPos * cos(angle) + yPos * sin(angle);
for (k = 0; k < SIZE; ++k) {
arg2 = -M_PI + dk*k;
arg3 = exp(-i * arg * arg2);
arg4 = abs(arg) * M_PI / (abs(arg) + M_PI);
temp = temp + arg4 * arg3 * phi_k[k][t];
}
}
newImg[y][x] = temp;
}
}
}
Is there a way I can improve computation time? I have tried using the following helper function but it doesn't noticeably help.
complex<double> complexexp(double arg) {
    // Euler's formula: e^(i*arg) = cos(arg) + i*sin(arg)
    complex<double> temp(cos(arg), sin(arg));
    return temp;
}
I am using clang++ to compile my code
Edit: I think the problem is the fact that I'm trying to calculate complex numbers. Would it be faster if I just used Euler's formula to calculate the real and imaginary parts in separate arrays and didn't have to deal with the complex class?
Maybe this will work for you:
http://martin.ankerl.com/2007/02/11/optimized-exponential-functions-for-java/
I've had a look with callgrind. The only marginal improvement (~1.3% with size = 50) I could find was to change:
temp = temp + arg4 * arg3 * phi_k[k][t];
to
temp += arg4 * arg3 * phi_k[k][t];
The most costly function calls were sin()/cos(). I suspect that calling exp() with a complex argument calls those functions behind the scenes.
To retain full precision the function will compute slowly, and there doesn't seem to be a way around that. However, you could trade precision for speed, which seems to be what game developers would do: sin and cos are slow, is there an alternative?
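One way to get exp() out of the innermost loop entirely (a sketch of a standard recurrence trick, not taken from the answers above): since arg2 = -M_PI + dk*k is linear in k, exp(-i*arg*arg2) can be advanced from one k to the next with a single complex multiplication. Note also that arg4 depends only on arg, so it can be hoisted out of the k loop.

// Drop-in replacement for the inner k loop, reusing the question's variables.
// exp(-i*arg*(-M_PI + dk*k)) = exp(i*arg*M_PI) * exp(-i*arg*dk)^k
complex<double> cur = polar(1.0, arg * M_PI); // value of the exponential at k = 0
complex<double> step = polar(1.0, -arg * dk); // rotation applied at each step
arg4 = abs(arg) * M_PI / (abs(arg) + M_PI);   // hoisted: does not depend on k
for (k = 0; k < SIZE; ++k) {
    temp += arg4 * cur * phi_k[k][t];
    cur *= step; // advance the exponential; rounding drift is negligible for SIZE = 256
}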
You can define the number e as a constant and use the std::pow() function.
I'm writing a sine function that has to be recursive. I have written a sine function but am not really sure how to do it recursively. Could someone explain how to get started on this?
This is what I have so far:
/*--------------------------------------------------------------
Name: sine( double X );
Return: Function "sine" will return the
sine of X, where X is measured in radians.
--------------------------------------------------------------*/
double sine(double X)
{
double result = 0;
double term;
int k;
double lim;
k = 0;
lim = power(10, -8);
term = power(-1, k)*power(X, ((2*k) + 1)) / (factorial((2*k)+1));
result = term;
while (absolute(term) > lim)
{
k += 1;
term = power(-1, k)*power(X, ((2*k) + 1)) / (factorial((2*k)+1));
result += term;
}
return result;
}
EDIT: I used a wrapper function to solve this. Basically, I created another function,
double sine_rec(double X, double k)
and changed the current code around to fit in with it.
The way I would approach this would be to have another function sine(double X, int n) which takes another integer parameter - the number of terms to include in the power series approximation. Then this function could return something like [nth term in series] + sine(X, n - 1) (just remember a prior if statement to deal with n = 1).
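A minimal sketch of that approach, reusing the asker's power and factorial helpers (here the base case is n = 0, counting terms from zero):

double sine(double X, int n)
{
    // n-th term of the Taylor series: (-1)^n * X^(2n+1) / (2n+1)!
    double term = power(-1, n) * power(X, (2 * n) + 1) / factorial((2 * n) + 1);
    if (n == 0)
        return term; // base case: lowest-order term only
    return term + sine(X, n - 1); // add the remaining lower-order terms
}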
You can eliminate the while loop by recursion in the following way:
double sine(double X, int k = 0)
{
    double lim = power(10, -8);
    double term = power(-1, k) * power(X, ((2 * k) + 1)) / (factorial((2 * k) + 1));
    if (absolute(term) > lim)
    {
        return sine(X, k + 1) + term;
    }
    else
    {
        return term;
    }
}
But I cannot recommend doing this at all. (There are better solutions even to this recursion, but find them on your own)