I am converting equations to C++. Is this correct for a running standard deviation?
this->runningStandardDeviation = (this->sumOfProcessedSquaredSamples - sumSquaredDividedBySampleCount) / (sampleCount - 1);
Here is the full function:
void BM_Functions::standardDeviationForRunningSamples (float samples [], int sampleCount)
{
    // update the running process samples count
    this->totalSamplesProcessed += sampleCount;

    // get the mean of the samples
    double mean = meanForSamples(samples, sampleCount);

    // sum the deviations
    // sum the squared deviations
    for (int i = 0; i < sampleCount; i++)
    {
        // update the deviation sum of processed samples
        double deviation = samples[i] - mean;
        this->sumOfProcessedSamples += deviation;

        // update the squared deviations sum
        double deviationSquared = deviation * deviation;
        this->sumOfProcessedSquaredSamples += deviationSquared;
    }

    // get the sum squared
    double sumSquared = this->sumOfProcessedSamples * this->sumOfProcessedSamples;

    // get the sum/N
    double sumSquaredDividedBySampleCount = sumSquared / this->totalSamplesProcessed;

    this->runningStandardDeviation = sqrt((this->sumOfProcessedSquaredSamples - sumSquaredDividedBySampleCount) / (sampleCount - 1));
}
A numerically stable and efficient algorithm for computing the running mean and variance/SD is Welford's algorithm.
One C++ implementation would be:
std::pair<double,double> getMeanVariance(const std::vector<double>& vec) {
    double mean = 0, M2 = 0, variance = 0;
    size_t n = vec.size();
    for(size_t i = 0; i < n; ++i) {
        double delta = vec[i] - mean;
        mean += delta / (i + 1);
        M2 += delta * (vec[i] - mean);
        variance = M2 / (i + 1);   // population variance; use M2 / i for the sample variance (i > 0)
        if (i >= 2) {
            // <-- You can use the running mean and variance here
        }
    }
    return std::make_pair(mean, variance);
}
Note: to get the SD, just take sqrt(variance)
You may check for a sufficient sampleCount (1 would cause a division by zero).
Make sure that the variables have a suitable data type (floating point).
Otherwise this looks correct...
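Since your function accumulates statistics across multiple calls, a minimal sketch of Welford's update in running (accumulator) form might look like the following. The class and member names here are illustrative only, not taken from your BM_Functions class:
#include <cmath>
#include <cstddef>

// Minimal running-statistics accumulator based on Welford's update.
class RunningStats {
public:
    void add(double x) {
        ++count;
        double delta = x - mean;
        mean += delta / count;
        M2 += delta * (x - mean);   // uses the updated mean
    }
    double standardDeviation() const {
        // sample SD; guard against count < 2 to avoid division by zero
        return count > 1 ? std::sqrt(M2 / (count - 1)) : 0.0;
    }
private:
    std::size_t count = 0;
    double mean = 0.0;
    double M2 = 0.0;
};
You would call add() once per incoming sample and read standardDeviation() whenever needed, so no sums of squares ever need to be subtracted from each other.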
I'm trying to set up a pipeline that lets me detect musical notes from audio samples, but the input layer, where I identify the frequency content of the samples, does not land on the expected values. In the example below I...
build what I expect to be a 440 Hz (A4) sine wave in the FFTW input buffer
apply the Hamming window function
look up the first half of the output bins to find the 4 top values and their frequencies
void GenerateSinWave(fftw_complex* outputArray, int N, double frequency, double samplingRate)
{
    double sampleDurationSeconds = 1.0 / samplingRate;
    for (int i = 0; i < N; ++i)
    {
        double sampleTime = i * sampleDurationSeconds;
        outputArray[i][0] = sin(M_2_PI * frequency * sampleTime);
    }
}

void HammingWindow(fftw_complex* array, int N)
{
    static const double a0 = 25.0 / 46.0;
    static const double a1 = 1 - a0;
    for (int i = 0; i < N; ++i)
        array[i][0] *= a0 - a1 * cos((M_2_PI * i) / N);
}

int main()
{
    const int N = 4096;
    double samplingRate = 44100;
    double A4Frequency = 440;

    fftw_complex in[N] = { 0 };
    fftw_complex out[N] = { 0 };
    fftw_plan plan = fftw_plan_dft_1d(N, 0, 0, FFTW_FORWARD, FFTW_ESTIMATE);

    GenerateSinWave(in, N, A4Frequency, samplingRate);
    HammingWindow(in, N);
    fftw_execute_dft(plan, in, out);

    // Find the 4 top values
    double binHzRange = samplingRate / N;
    for (int i = 0; i < 4; ++i)
    {
        double maxValue = 0;
        int maxBin = 0;
        for (int bin = 0; bin < (N/2); ++bin)
        {
            if (out[bin][0] > maxValue)
            {
                maxValue = out[bin][0];
                maxBin = bin;
            }
        }
        out[maxBin][0] = 0; // remove value for next pass
        double binMidFreq = (maxBin * binHzRange) + (binHzRange / 2);
        std::cout << (i + 1) << " -> Freq: " << binMidFreq << " Hz - Value: " << maxValue << "\n";
    }

    fftw_destroy_plan(plan);
}
I was expecting something close to 440 Hz or its lower/higher harmonics; however, the results are far from that:
1 -> Freq: 48.4497Hz - Value: 110.263
2 -> Freq: 59.2163Hz - Value: 19.2777
3 -> Freq: 69.9829Hz - Value: 5.68717
4 -> Freq: 80.7495Hz - Value: 2.97571
This flow is mostly inspired by this other SO answer. I feel that my lack of knowledge about signal processing might be the cause! My sine wave generation and window function seem to be OK, but audio analysis and FFTW are full of mysteries...
Any insight about how to improve my usage of FFTW, how to approach signal processing, or simply how to write better code is appreciated!
EDIT: fixed an integer division that led to the Hamming a0 parameter always being 0. The results changed a little, but are still far from the expected 440 Hz.
I think you've misunderstood the M_2_PI constant in your GenerateSinWave function. M_2_PI is defined as 2.0 / PI.
You should be using 2 * M_PI instead.
This mistake means that your generated signal has a frequency of only around 45 Hz: the effective angular frequency is (2 / pi) * 440 ≈ 280 rad/s, i.e. about 44.6 Hz. That is close to the output frequencies you are seeing.
The same constant needs correcting in your HammingWindow function too.
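For reference, a minimal sketch of the corrected generator under that assumption (only the constant changes; everything else from the question is untouched, and it assumes fftw3.h and M_PI are available, which may need _USE_MATH_DEFINES on MSVC):
#include <cmath>
#include <fftw3.h>

// Corrected sketch: 2 * M_PI is a full turn in radians, whereas M_2_PI is 2 / pi.
void GenerateSinWave(fftw_complex* outputArray, int N, double frequency, double samplingRate)
{
    double sampleDurationSeconds = 1.0 / samplingRate;
    for (int i = 0; i < N; ++i)
    {
        double sampleTime = i * sampleDurationSeconds;
        outputArray[i][0] = sin(2.0 * M_PI * frequency * sampleTime); // real part
        outputArray[i][1] = 0.0;                                      // imaginary part
    }
}
The same substitution of 2.0 * M_PI for M_2_PI applies inside HammingWindow.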
Suppose we need to generate a very long harmonic signal, ideally infinitely long. At first glance, the solution seems trivial:
Sample1:
float t = 0;
while (runned)
{
    float v = sinf(w * t);
    t += dt;
}
Unfortunately, this is not a working solution. For t >> dt, incorrect values will be obtained due to the limited precision of float. Fortunately, we can recall that sin(2*PI*n + x) = sin(x), where n is an arbitrary integer, so it is not difficult to modify the example into an "infinite" analog:
Sample2:
float t = 0;
float tau = 2 * M_PI / w;
while (runned)
{
    float v = sinf(w * t);
    t += dt;
    if (t > tau) t -= tau;
}
For one physical simulation, I needed to get an infinite signal, which is the sum of harmonic signals, like that:
Sample3:
float getSignal(float x)
{
    float ret = 0;
    for (int i = 0; i < modNum; i++)
        ret += sin(w[i] * x);
    return ret;
}

float t = 0;
while (runned)
{
    float v = getSignal(t);
    t += dt;
}
In this form, the code does not work correctly for large t, for reasons similar to Sample1. The question is: how do I get an "infinite" implementation of the Sample3 algorithm? I assume that the solution should look like Sample2. A very important note: generally speaking, the w[i] are arbitrary and not harmonics, i.e. the frequencies are not all multiples of some base frequency, so I can't find a common tau. Using types with greater precision (double, long double) is not allowed.
Thanks for your advice!
You can choose an arbitrary tau and store the phase remainders for each mode when subtracting it from t (as @Damien suggested in the comments).
Also, representing the time as t = dt * it, where it is an integer, can improve numerical stability (I think).
Maybe something like this:
int ndt = 1000; // accumulate phase every 1000 steps for example
float tau = dt * ndt;
std::vector<float> phases(modNum, 0.0f);

int it = 0;
float t = 0.0f;
while (runned)
{
    t = dt * it;
    float v = 0.0f;
    for (int i = 0; i < modNum; i++)
    {
        v += sinf(w[i] * t + phases[i]);
    }
    if (++it >= ndt)
    {
        it = 0;
        for (int i = 0; i < modNum; ++i)
        {
            phases[i] = fmod(w[i] * tau + phases[i], 2 * M_PI);
        }
    }
}
Consider the following C++ function in R using Rcpp:
cppFunction('long double statZn_cpp(NumericVector dat, double kn) {
    double n = dat.size();

    // Get total sum and sum of squares; this will be the "upper sum"
    // (i.e. the sum above k)
    long double s_upper, s_square_upper;
    // The "lower sums" (i.e. those below k)
    long double s_lower, s_square_lower;

    // Get lower sums
    // Go to kn - 1 to prevent double-counting in main loop
    for (int i = 0; i < kn - 1; ++i) {
        s_lower += dat[i];
        s_square_lower += dat[i] * dat[i];
    }
    // Get upper sum
    for (int i = kn - 1; i < n; ++i) {
        s_upper += dat[i];
        s_square_upper += dat[i] * dat[i];
    }

    // The maximum, which will be returned
    long double M = 0;
    // A candidate for the new maximum, used in a loop
    long double M_candidate;

    // Compute the test statistic
    for (int k = kn; k <= (n - kn); ++k) {
        // Update s and s_square for both lower and upper
        s_lower += dat[k-1];
        s_square_lower += dat[k-1] * dat[k-1];
        s_upper -= dat[k-1];
        s_square_upper -= dat[k-1] * dat[k-1];

        // Get estimate of sd for this k
        long double sdk = sqrt((s_square_lower - pow(s_lower, 2.0) / k +
                                s_square_upper -
                                pow(s_upper, 2.0) / (n - k))/n);

        M_candidate = abs(s_lower / k - s_upper / (n - k)) / sdk;

        // Choose new maximum
        if (M_candidate > M) {
            M = M_candidate;
        }
    }
    return M * sqrt(kn);
}')
Try the command statZn_cpp(1:20,4), and you will get 6.963106, which is the correct answer. Scaling should not matter; statZn_cpp(1:20*10,4) will also yield the correct answer of 6.963106. But statZn_cpp(1:20/10,4) yields the wrong answer of 6.575959, and statZn_cpp(1:20/100,4) again gives you the obviously wrong answer of 0. More to the point (and relevant to my research, which involves simulation studies), when I try statZn_cpp(rnorm(20),4), the answer is almost always 0, which is wrong.
Clearly the problem has to do with rounding errors, but I don't know where they are or how to fix them (I am brand new to C++). I've tried to expand precision as much as possible. Is there a way to fix the rounding problem? (An R wrapper function is permissible if I should be attempting what amounts to a preprocessing step, but it needs to be robust, working for general levels of precision.)
EDIT: Here is some "equivalent" R code:
statZn <- function(dat, kn = function(n) {floor(sqrt(n))}) {
n = length(dat)
return(sqrt(kn(n))*max(sapply(
floor(kn(n)):(n - floor(kn(n))), function(k)
abs(1/k*sum(dat[1:k]) -
1/(n-k)*sum(dat[(k+1):n]))/sqrt((sum((dat[1:k] -
mean(dat[1:k]))^2)+sum((dat[(k+1):n] -
mean(dat[(k+1):n]))^2))/n))))
}
Also, the R code below basically replicates the method that should be used by the C++ code. It is capable of reaching the correct answer.
n = length(dat)
s_lower = 0
s_square_lower = 0
s_upper = 0
s_square_upper = 0
for (i in 1:(kn-1)) {
    s_lower = s_lower + dat[i]
    s_square_lower = s_square_lower + dat[i] * dat[i]
}
for (i in kn:n) {
    s_upper = s_upper + dat[i]
    s_square_upper = s_square_upper + dat[i] * dat[i]
}
M = 0
for (k in kn:(n-kn)) {
    s_lower = s_lower + dat[k]
    s_square_lower = s_square_lower + dat[k] * dat[k]
    s_upper = s_upper - dat[k]
    s_square_upper = s_square_upper - dat[k] * dat[k]
    sdk = sqrt((s_square_lower - (s_lower)^2/k +
                s_square_upper -
                (s_upper)^2/(n-k))/n)
    M_candidate = sqrt(kn) * abs(s_lower / k - s_upper / (n - k)) / sdk
    cat('k', k, '\n',
        "s_lower", s_lower, '\n',
        's_square_lower', s_square_lower, '\n',
        's_upper', s_upper, '\n',
        's_square_upper', s_square_upper, '\n',
        'sdk', sdk, '\n',
        'M_candidate', M_candidate, '\n\n')
    if (M_candidate > M) {
        M = M_candidate
    }
}
1: You should not be using long double, since R represents all numeric values in the double type. Using a more precise type for intermediate calculations is extremely unlikely to provide any benefit, and is more likely to result in strange inconsistencies between platforms.
2: You're not initializing s_upper, s_square_upper, s_lower, and s_square_lower. (You actually are initializing them in the R implementation, but you forgot in the C++ implementation.)
3: Minor point, but I would replace the pow(x,2.0) calls with x*x. Although this doesn't really matter.
4: This is what fixed it for me: You need to qualify calls to C++ standard library functions with their containing namespace. IOW, std::sqrt() instead of just sqrt(), std::abs() instead of just abs(), and std::pow() instead of just pow() if you continue to use it.
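To see why the unqualified abs() call can produce zeros for small-magnitude data, here is a minimal, self-contained illustration. It assumes the unqualified call was binding to the C-style integer overload, which truncates its argument to int before taking the absolute value:
#include <cmath>
#include <cstdlib>
#include <iostream>

int main() {
    double diff = 0.75;   // e.g. a difference of means smaller than 1 in magnitude
    // What an int abs() effectively returns after the argument is truncated to int:
    int truncated = std::abs(static_cast<int>(diff));   // 0
    // What std::abs(double) returns:
    double correct = std::abs(diff);                     // 0.75
    std::cout << truncated << " vs " << correct << "\n";
    return 0;
}
With data scaled so every partial difference is below 1 (as with 1:20/100 or rnorm(20)), the truncated version makes every M_candidate zero, which matches the behaviour described in the question.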
cppFunction('double statZn_cpp(NumericVector dat, double kn) {
    int n = dat.size();
    double s_upper = 0, s_square_upper = 0; // Get total sum and sum of squares; this will be the "upper sum" (i.e. the sum above k)
    double s_lower = 0, s_square_lower = 0; // The "lower sums" (i.e. those below k)
    for (int i = 0; i < kn - 1; ++i) { s_lower += dat[i]; s_square_lower += dat[i] * dat[i]; } // Get lower sums; go to kn - 1 to prevent double-counting in main loop
    for (int i = kn - 1; i < n; ++i) { s_upper += dat[i]; s_square_upper += dat[i] * dat[i]; } // Get upper sum
    double M = 0; // The maximum, which will be returned
    double M_candidate; // A candidate for the new maximum, used in a loop
    // Compute the test statistic
    for (int k = kn; k <= (n - kn); ++k) {
        // Update s and s_square for both lower and upper
        s_lower += dat[k-1];
        s_square_lower += dat[k-1] * dat[k-1];
        s_upper -= dat[k-1];
        s_square_upper -= dat[k-1] * dat[k-1];
        // Get estimate of sd for this k
        double sdk = std::sqrt((s_square_lower - s_lower*s_lower / k + s_square_upper - s_upper*s_upper / (n - k))/n);
        M_candidate = std::abs(s_lower / k - s_upper / (n - k)) / sdk;
        if (M_candidate > M) M = M_candidate; // Choose new maximum
    }
    return std::sqrt(kn) * M;
}');
statZn_cpp(1:20,4); ## 6.963106, which is the correct answer
## [1] 6.963106
statZn_cpp(1:20*10,4); ## scaling should not matter; also yields the correct 6.963106
## [1] 6.963106
statZn_cpp(1:20/10,4); ## previously yielded the wrong answer of 6.575959; now correct
## [1] 6.963106
statZn_cpp(1:20/100,4); ## previously gave the obviously wrong answer of 0; now correct
## [1] 6.963106
set.seed(1L); statZn_cpp(rnorm(20),4); ## previously almost always 0 (relevant to the simulation studies); now a sensible value
## [1] 1.270117
I have a program that solves 1D Brownian motion generally using Euler's method.
Being a stochastic process, I want to average it over many particles. But I find that as I ramp up the number of particles, it overloads and I get a std::bad_alloc error, which I understand is a memory error.
Here is my full code:
#include <iostream>
#include <vector>
#include <fstream>
#include <cmath>
#include <cstdlib>
#include <limits>
#include <ctime>

using namespace std;

// Box-Muller Method to generate gaussian numbers
double generateGaussianNoise(double mu, double sigma) {
    const double epsilon = std::numeric_limits<double>::min();
    const double tau = 2.0 * 3.14159265358979323846;

    static double z0, z1;
    static bool generate;
    generate = !generate;
    if (!generate) return z1 * sigma + mu;

    double u1, u2;
    do {
        u1 = rand() * (1.0 / RAND_MAX);
        u2 = rand() * (1.0 / RAND_MAX);
    } while (u1 <= epsilon);

    z0 = sqrt(-2.0 * log(u1)) * cos(tau * u2);
    z1 = sqrt(-2.0 * log(u1)) * sin(tau * u2);
    return z0 * sigma + mu;
}

int main() {
    // Initialize Variables
    double gg; // Gaussian Number Picked from distribution

    // Integrator
    double t0 = 0;            // Setting the Time Window
    double tf = 10;
    double n = 5000;          // Number of Steps
    double h = (tf - t0) / n; // Time Step Size

    // Set Constants
    const double pii = atan(1) * 4; // pi
    const double eta = 1;           // viscous constant
    const double m = 1;             // mass
    const double aa = 1;            // radius
    const double Temp = 30;         // Temperature in Kelvins
    const double KB = 1;            // Boltzmann Constant
    const double alpha = (6 * pii * eta * aa);

    // More Constants
    const double mu = 0;       // Gaussian Mean
    const double sigma = 1;    // Gaussian Std Deviation
    const double ng = n;       // No. of pts to generate for Gauss distribution
    const double npart = 1000; // No. of Particles

    // Initial Conditions
    double x0 = 0;
    double y0 = 0;
    double t = t0;

    // Vectors
    vector<double> storX;         // Vector that keeps displacement values
    vector<double> storY;         // Vector that keeps velocity values
    vector<double> storT;         // Vector to store time
    vector<double> storeGaussian; // Vector to store Gaussian numbers generated
    vector<double> holder;        // Placeholder Vector for calculation operations
    vector<double> mainstore;     // Vector that holds the final value desired

    storT.push_back(t0);

    // Prepares mainstore
    for (int z = 0; z < (n+1); z++) {
        mainstore.push_back(0);
    }

    for (int NN = 0; NN < npart; NN++) {
        holder.clear();
        storX.clear();
        storY.clear();
        storT.clear();
        storT.push_back(0);

        // Prepares holder
        for (int z = 0; z < (n+1); z++) {
            holder.push_back(0);
            storX.push_back(0);
            storY.push_back(0);
        }

        // Gaussian Generator
        srand(time(NULL));
        for (double iiii = 0; iiii < ng; iiii++) {
            gg = generateGaussianNoise(0, 1); // generateGaussianNoise(mu,sigma)
            storeGaussian.push_back(gg);
        }

        // Solver
        for (int ii = 0; ii < n; ii++) {
            storY[ii + 1] = storY[ii] - (alpha / m) * storY[ii] * h +
                            (sqrt(2 * alpha * KB * Temp) / m) * sqrt(h) * storeGaussian[ii];
            storX[ii + 1] = storX[ii] + storY[ii] * h;
            holder[ii + 1] = pow(storX[ii + 1], 2); // Finds the displacement squared
            t = t + h;
            storT.push_back(t);
        }

        // Updates the Main Storage
        for (int z = 0; z < storX.size(); z++) {
            mainstore[z] = mainstore[z] + holder[z];
        }
    }

    // Average over the number of particles
    for (int z = 0; z < storX.size(); z++) {
        mainstore[z] = mainstore[z] / (npart);
    }

    // Outputs the data
    ofstream fout("LangevinEulerTest.txt");
    for (int jj = 0; jj < storX.size(); jj++) {
        fout << storT[jj] << '\t' << mainstore[jj] << '\t' << storX[jj] << endl;
    }

    return 0;
}
As you can see, npart is the variable that I change to vary the number of particles. But after each iteration I do clear my storage vectors like storX, storY, ... so on paper the number of particles should not affect memory? I am only asking the program to repeat the loop many more times and add onto the main storage vector mainstore. I am running my code on a computer with 4 GB of RAM.
I would greatly appreciate it if anyone could point out my errors in logic or suggest improvements.
Edit: Currently the number of particles is set to npart = 1000. When I try to ramp it up to npart = 20000 or npart = 50000, it gives me memory errors.
Edit 2: I've edited the code to allocate an extra index to each of the storage vectors, but it does not seem to fix the memory overflow.
There is an out-of-bounds access in the solver part: storY has size n and you access index ii + 1, where ii goes up to n - 1. For the code you provided, storY has size 5000, so it may be accessed with indices 0 to 4999 (inclusive), but you try to access index 5000. The same holds for storX, holder and mainstore.
Also, storeGaussian does not get cleared before new values are added; it grows by n on every npart iteration, even though the solver only ever reads its first n values.
Please note that vector::clear removes all elements from the vector, but does not necessarily change the vector's capacity (i.e. its storage array), see the documentation.
This won't cause the problem here, because you'll reuse the same array in the next runs, but it's something to be aware of when using vectors.
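That unbounded growth of storeGaussian is also why memory scales with the number of particles: at n = 5000 doubles per particle, npart = 50000 means roughly 2 GB for that one vector alone. A sketch of one possible fix, using the question's own variable names, is simply to clear it at the top of each particle loop before refilling it:
// Inside the for (int NN = 0; NN < npart; NN++) loop, before the Gaussian generator:
storeGaussian.clear();        // drop last particle's samples, capacity is reused
// storeGaussian.reserve(ng); // optional: avoid repeated reallocation on the first pass
for (double iiii = 0; iiii < ng; iiii++) {
    storeGaussian.push_back(generateGaussianNoise(0, 1));
}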
I have a list of doubles roughly in the range of -1.396655 to 1.74707 (the values could be even higher or lower); either way, I will know the min and max values before normalizing. My question is: how can I normalize these values to between -1 and 1, or better yet, convert them from double values to char values of 0 to 255?
Any help would be appreciated.
double range = (double)(max - min);
value = 255 * (value - min) / range;
You need a mapping of the form y = mx + c, and you need to find an m and a c. You have two fixed data-points, i.e.:
1 = m * max + c
-1 = m * min + c
From there, it's simple algebra.
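Working through that algebra gives m = 2 / (max - min) and c = -(max + min) / (max - min). A minimal sketch in C++ (the variable names are illustrative only):
// Map x from [minVal, maxVal] onto [-1, 1]; assumes maxVal > minVal
double m = 2.0 / (maxVal - minVal);
double c = -(maxVal + minVal) / (maxVal - minVal);
double y = m * x + c;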
The easiest thing is to first shift all the values so that min is 0, by subtracting Min from each number. Then multiply by 255/(Max-Min), so that the shifted Max will get mapped to 255, and everything else will scale linearly. So I believe your equation would look like this:
newval = (unsigned char) ((oldval - Min)*(255/(Max-Min)))
You may want to round a bit more carefully before casting to char.
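For that careful rounding, one hedged option (requires C++17 for std::clamp; names taken from the formula above) is:
#include <algorithm> // std::clamp
#include <cmath>     // std::lround

// Round to the nearest integer and clamp before the narrowing cast
long scaled = std::lround((oldval - Min) * 255.0 / (Max - Min));
unsigned char newval = static_cast<unsigned char>(std::clamp(scaled, 0L, 255L));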
There are two changes to be made.
First, use 256 as the limit.
Second, make sure your range is scaled back slightly to avoid getting 256.
public int GetRangedValue(double value, double min, double max)
{
    int outputLimit = 256;
    double range = (max - min) - double.Epsilon; // Here we shorten the range slightly
    // Then we build a range such that value >= 0 and value < 1
    double rangedValue = (value - min) / range;
    return (int)(outputLimit * rangedValue); // scale into [0, outputLimit)
}
With these two changes, you will get the correct distribution in your output.
I ran into this need when I dived into some convolution work using C++.
Hopefully my code can serve as a useful reference :)
bool normalize(uint8_t*& dst, double* src, int width, int height) {
    dst = new uint8_t[sizeof(uint8_t)*width*height];
    if (dst == NULL)
        return false;
    memset(dst, 0, sizeof(uint8_t)*width*height);

    // start from the lowest/highest representable values so any input,
    // including all-negative data, can update both bounds
    double max = std::numeric_limits<double>::lowest();
    double min = std::numeric_limits<double>::max();
    double range = std::numeric_limits<double>::max();
    double norm = 0.0;

    // find the boundary
    for (int j=0; j<height; j++) {
        for (int i=0; i<width; i++) {
            if (src[i+j*width] > max)
                max = src[i+j*width];
            if (src[i+j*width] < min)
                min = src[i+j*width];
        }
    }

    // normalize double matrix to be an uint8_t matrix
    range = max - min;
    for (int j=0; j<height; j++) {
        for (int i=0; i<width; i++) {
            norm = src[i+j*width];
            norm = 255.0*(norm-min)/range;
            dst[i+j*width] = (uint8_t)norm;
        }
    }
    return true;
}
Basically the output (which the caller receives via 'dst') is in the range [0, 255].