Marsaglia Normal Random Variables in C++

My query is pretty simple. I prefer to code numerical methods in Java but often need to do some things in C++. I like Java's Gaussian random variable since it uses the Marsaglia polar algorithm AND keeps both normal random variables: it returns one on the first call, the second on the second call, and does not repeat the expensive calculation until the third call. Using the Oracle link below (in the program comments) I tried to implement this code in C++, but I don't know how to write the C++ equivalent of the synchronized public method that would let me make use of both normal random variables. I am not a professional programmer, so any guidance would be greatly appreciated.
In short, I would like to keep:
v2 * multiplier
// This function is similar to the GNU
// Java implementation as seen on
// http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Random.html#nextGaussian%28%29
double nextGaussian() {
    double v1, v2, s, nextNextGaussian;
    do {
        v1 = 2 * nextUniform() - 1; // between -1.0 and 1.0
        v2 = 2 * nextUniform() - 1; // between -1.0 and 1.0
        s = v1 * v1 + v2 * v2;
    } while (s >= 1 || s == 0);
    double multiplier = sqrt(-2 * log(s) / s);
    nextNextGaussian = v2 * multiplier; // local variable: lost when the function returns
    return v1 * multiplier;
}

Just declare nextGaussianVal as static, i.e.
static double nextGaussianVal;
Then the value of nextGaussianVal will be available the next time the method is called. You might also need another static variable to keep up with the current count, like so:
double nextGaussian()
{
    static int count = 0;
    static double nextGaussianVal;
    double firstGaussianVal, v1, v2, s;
    if (count == 0) {
        do {
            v1 = 2 * nextUniform() - 1; // between -1.0 and 1.0
            v2 = 2 * nextUniform() - 1; // between -1.0 and 1.0
            s = v1 * v1 + v2 * v2;
        } while (s >= 1 || s == 0);
        double multiplier = sqrt(-2 * log(s) / s);
        nextGaussianVal = v2 * multiplier;
        firstGaussianVal = v1 * multiplier;
        count = 1;
        return firstGaussianVal;
    }
    count = 0;
    return nextGaussianVal;
}
Edit: A more detailed explanation. The first time the function is called, count is initialized to zero. Based on the if statement, the expensive calculation is performed, firstGaussianVal and nextGaussianVal are assigned values, count is set to one, and firstGaussianVal is returned. The next time the function is called, count still holds the previously assigned value of one, and nextGaussianVal still holds the value assigned during the first call. Since count is now one, the function skips the if block, resets count to zero, and returns nextGaussianVal. Rinse, repeat...

In a more object-oriented manner, you should keep that stuff in a "random-number-generator" object. Look, for instance, at this code:
https://code.cor-lab.org/projects/nemomath/repository/entry/trunk/nemomath/src/nemo/Random.h
The class "gaussian" implements what you want in exactly the algorithmic way you want it.
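For illustration, here is a minimal sketch of such a wrapper class. This is not the linked code: the class name GaussianGenerator and the use of the C++11 std::mt19937 engine are my own choices.
#include <cmath>
#include <random>

// Sketch of a polar-Marsaglia Gaussian generator kept in an object.
// The class name and the std::mt19937 engine are illustrative choices.
class GaussianGenerator {
public:
    explicit GaussianGenerator(unsigned seed)
        : engine_(seed), uniform_(-1.0, 1.0) {}

    double next() {
        if (hasSpare_) {            // second variate from the last pair
            hasSpare_ = false;
            return spare_;
        }
        double v1, v2, s;
        do {
            v1 = uniform_(engine_); // between -1.0 and 1.0
            v2 = uniform_(engine_); // between -1.0 and 1.0
            s = v1 * v1 + v2 * v2;
        } while (s >= 1.0 || s == 0.0);
        double multiplier = std::sqrt(-2.0 * std::log(s) / s);
        spare_ = v2 * multiplier;   // keep the second variate for the next call
        hasSpare_ = true;
        return v1 * multiplier;
    }

private:
    std::mt19937 engine_;
    std::uniform_real_distribution<double> uniform_;
    bool hasSpare_ = false;
    double spare_ = 0.0;
};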

The problem with the above suggestion is that the count variable should be a boolean rather than an integer. In addition, the stored Gaussian also needs to be static.
I want to thank everybody for helping me arrive at the solution I desired. I am aware that this snippet belongs in an object; I have my reasons for keeping it in one file right now.
#include <cmath>   // sqrt, log
#include <cstdlib> // rand, RAND_MAX

double nextUniform(); // declared up front so nextGaussian() can call it

double nextGaussian() {
    // Static variables allow the function to make use of both Gaussian
    // random variables generated by the polar Marsaglia method. This
    // makes the function much more efficient, which will pay off in
    // simulations.
    static bool hasNextGaussian = false;
    static double nextNextGaussian;
    double v1, v2, s;
    if (!hasNextGaussian) {
        do {
            v1 = 2 * nextUniform() - 1; // between -1.0 and 1.0
            v2 = 2 * nextUniform() - 1; // between -1.0 and 1.0
            s = v1 * v1 + v2 * v2;
        } while (s >= 1 || s == 0);
        double multiplier = sqrt(-2 * log(s) / s);
        nextNextGaussian = v2 * multiplier;
        hasNextGaussian = true;
        return v1 * multiplier;
    } else {
        hasNextGaussian = false;
        return nextNextGaussian;
    }
}

double nextUniform() {
    return rand() / double(RAND_MAX);
}
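For completeness, a small driver sketch (my own, not part of the answer) showing how the two functions above might be used, assuming they live in the same file:
#include <cstdio>
#include <ctime>

// Illustrative driver: seed rand() once, then draw a few samples.
int main() {
    srand(static_cast<unsigned>(time(nullptr)));
    for (int i = 0; i < 4; ++i)
        printf("%f\n", nextGaussian());
    return 0;
}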

Related

Quadratic Algebra advice for Array like return function

I have a problem. I want to write a method which uses the PQ formula to calculate the zeros of a quadratic equation.
As I see, C++ doesn't support arrays, unlike C#, which I normally use.
How do I get either zero, 1 or 2 results returned?
Is there any other way that works without an array?
I'm not really into pointers yet, so my actual code is broken.
I'd be glad if someone could help me.
float* calculateZeros(float p, float q)
{
    float *x1, *x2;
    if (((p) / 2)*((p) / 2) - (q) < 0)
        throw std::exception("No Zeros!");
    x1 *= -((p) / 2) + sqrt(static_cast<double>(((p) / 2)*((p) / 2) - (q)));
    x2 *= -((p) / 2) - sqrt(static_cast<double>(((p) / 2)*((p) / 2) - (q)));
    float returnValue[1];
    returnValue[0] = x1;
    returnValue[1] = x2;
    return x1 != x2 ? returnValue[0] : x1;
}
Actually this code is not compilable, but this is the code I've done so far.
There are quite a few issues with it; first of all, I'll drop all those totally needless parentheses, since they just make the code (much) harder to read:
float* calculateZeros(float p, float q)
{
    float *x1, *x2; // pointers are never initialized!!!
    if ((p / 2)*(p / 2) - q < 0)
        throw std::exception("No Zeros!"); // zeros? q just needs to be large enough!
    x1 *= -(p / 2) + sqrt(static_cast<double>((p / 2)*(p / 2) - q));
    x2 *= -(p / 2) - sqrt(static_cast<double>((p / 2)*(p / 2) - q));
    // ^ this would multiply the pointer values! but these are not initialized -> UB!!!
    float returnValue[1]; // an array of length ONE, yet two elements are assigned below!
    returnValue[0] = x1;  // you are assigning a pointer to a value here
    returnValue[1] = x2;
    return x1 != x2 ? returnValue[0] : x1;
    //     ^ value!                    ^ pointer!
    // apart from that, if you returned a pointer to the returnValue array, you would
    // return a pointer to data with scope local to the function – i.e. the array
    // is destroyed upon leaving the function, thus the pointer returned would be
    // INVALID as soon as the function is exited; using it would again result in UB!
}
As is, your code wouldn't even compile...
As I see C++ doesn't support arrays
Well... I assume you meant: 'arrays as return values or function parameters'. That's true for raw arrays; these can only be passed as pointers. But you can accept structs and classes as parameters or use them as return values. You want to return both calculated values? Then you could use e.g. std::array<float, 2>; std::array is a wrapper around raw arrays that avoids all the hassle you have with the latter... As there are exactly two values, you could use std::pair<float, float>, too, or std::tuple<float, float>.
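For the fixed two-root case, a minimal sketch using std::array could look like this (the function name is mine, and it assumes roots exist; note std::runtime_error is used, as std::exception has no string constructor):
#include <array>
#include <cmath>
#include <stdexcept>

// Illustrative sketch: return both roots by value via std::array.
std::array<float, 2> calculateZerosArray(float p, float q)
{
    float h = p / 2;
    float d = h * h - q;  // discriminant
    if (d < 0)
        throw std::runtime_error("No zeros!");
    float r = std::sqrt(d);
    return { -h + r, -h - r };
}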
Want to be able to return either 2, 1 or 0 values? std::vector<float> might be your choice...
std::vector<float> calculateZeros(float p, float q)
{
    std::vector<float> results;
    // don't repeat the code all the time...
    double h = static_cast<double>(p) / 2; // "half"
    double s = h * h;                      // "square" (of half)
    if (s >= q) // s greater than or equal to q (exact comparison; refined below)
    {
        // only enter if we CAN have a result; otherwise the vector remains empty
        // this is far better behaviour than the exception
        double r = sqrt(s - q); // "root"
        h = -h;
        if (r == 0) // r equals 0 (exact comparison; refined below)
        {
            results.push_back(h);
        }
        else
        {
            results.reserve(2); // prevents re-allocations;
                                // admitted, for just two values, we could live with...
            results.push_back(h + r);
            results.push_back(h - r);
        }
    }
    return results;
}
Now there's one final issue left: as the precision even of double is limited, rounding errors can occur (and the matter is even worse when using float; I would recommend turning all the floats into doubles, parameters and return values as well!). You shouldn't ever compare floating-point values for exact equality (someValue == 0.0), but consider some epsilon to cover badly rounded values:
-epsilon < someValue && someValue < +epsilon
OK, in the given case there are two originally exact comparisons involved, and we might want to do as few epsilon comparisons as possible. So:
double d = s - q;
if (d > -epsilon)
{
    // considered 0 or greater
    h = -h;
    if (d < +epsilon)
    {
        // considered 0 (and then no need to calculate the root at all...)
        results.push_back(h);
    }
    else
    {
        // considered greater than 0
        double r = sqrt(d);
        results.push_back(h - r);
        results.push_back(h + r);
    }
}
Value of epsilon? Well, either use a fixed, small enough value, or calculate it dynamically based on the smaller of the two values (multiplied by some small factor) – and be sure to keep it positive... You might be interested in a bit more information on the matter. You don't have to care about this not being C++-specific – the issue is the same for all languages using IEEE 754 representation for doubles.
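Putting the pieces together, a complete sketch of the approach (my assembly of the fragments above, with an illustrative fixed epsilon and everything widened to double as recommended):
#include <cmath>
#include <vector>

// 0, 1 or 2 roots via std::vector, with an epsilon-tolerant
// discriminant test; the epsilon value is an illustrative choice.
std::vector<double> calculateZeros(double p, double q)
{
    const double epsilon = 1e-9;
    std::vector<double> results;
    double h = p / 2;      // "half"
    double d = h * h - q;  // discriminant
    if (d > -epsilon)      // considered 0 or greater
    {
        h = -h;
        if (d < +epsilon)  // considered 0: one double root
        {
            results.push_back(h);
        }
        else               // considered greater than 0: two roots
        {
            double r = std::sqrt(d);
            results.push_back(h - r);
            results.push_back(h + r);
        }
    }
    return results;
}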

How to do logarithmic binning on a histogram?

I'm looking for a technique to logarithmically bin some data sets. We've got data with values ranging from _min to _max (floats >= 0) and the user needs to be able to specify a varying number of bins _num_bins (some int n).
I've implemented a solution taken from this question and some help on scaling here but my solution stops working when my data values lie below 1.0.
class Histogram {
    double _min, _max;
    int _num_bins;
    // ...
};
double Histogram::logarithmicValueOfBin(double in) const {
    if (in == 0.0)
        return _min;
    double b = std::log(_max / _min) / (_max - _min);
    double a = _max / std::exp(b * _max);
    double in_unscaled = in * (_max - _min) / _num_bins + _min;
    return a * std::exp(b * in_unscaled);
}
When the data values are all greater than 1 I get nicely sized bins and can plot properly. When the values are less than 1 the bins come out more or less the same size and we get way too many of them.
I found a solution by reimplementing an open-source version of Matlab's logspace function.
Given a range and a number of bins, you first need to create an evenly spaced numerical sequence:
module.exports = function linspace(a, b, n) {
    var every = (b - a) / (n - 1),
        ranged = integers(a, b, every);
    return ranged.length == n ? ranged : ranged.concat(b);
}
After that, you need to loop through each value and, with your base (e, 2 or 10 most likely), raise the base to that power, and you get your bin ranges.
module.exports.logspace = function logspace(a, b, n) {
    return linspace(a, b, n).map(function(x) { return Math.pow(10, x); });
}
I rewrote this in C++ and it's able to support ranges > 0.
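The poster doesn't show their C++ version, but a hedged translation of the two JavaScript helpers might look like this (assuming n >= 2):
#include <cmath>
#include <vector>

// Possible C++ rendering of linspace: n evenly spaced values from a to b.
std::vector<double> linspace(double a, double b, int n)
{
    std::vector<double> result(n);
    double step = (b - a) / (n - 1);
    for (int i = 0; i < n; ++i)
        result[i] = a + step * i;
    return result;
}

// logspace: raise the base (here 10) to each exponent to get the bin edges.
std::vector<double> logspace(double a, double b, int n)
{
    std::vector<double> edges = linspace(a, b, n);
    for (double& x : edges)
        x = std::pow(10.0, x);
    return edges;
}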
You can do something like the following:
#include <cmath>
#include <vector>

// Create isolethargic binning
int T_MIN = 0;                        // the lower limit, i.e. 1.e0
int T_MAX = 8;                        // the upper limit, i.e. 1.e8
int ndec = T_MAX - T_MIN;             // number of decades
int N_BPDEC = 1000;                   // number of bins per decade
int nbins = ndec * N_BPDEC;           // total number of bins
double step = (double) ndec / nbins;  // the increment
std::vector<double> tbins(nbins + 1); // the bin edges (a raw array here would be a non-standard VLA)
for (int i = 0; i <= nbins; ++i)
    tbins[i] = pow(10., step * i + T_MIN);
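To actually assign a value to one of these bins, std::upper_bound on the sorted edges works; a sketch (mine, assuming the edges are ascending as built above):
#include <algorithm>
#include <vector>

// Index of the bin containing value, or -1 if out of range.
// Assumes edges.size() == nbins + 1 and edges sorted ascending.
int binIndex(const std::vector<double>& edges, double value)
{
    if (value < edges.front() || value >= edges.back())
        return -1;
    // First edge strictly greater than value; the bin is one to its left.
    auto it = std::upper_bound(edges.begin(), edges.end(), value);
    return static_cast<int>(it - edges.begin()) - 1;
}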

Boids colliding with each other

I was looking at some pseudocode for boids and wrote it in C++. However, I am finding that boids will occasionally collide with each other. I thought that I had programmed it correctly, given how simple the pseudocode is. Yet, when I display the locations of all the boids, some of them have the same coordinates.
The pseudocode from the link:
PROCEDURE rule2(boid bJ)
    Vector c = 0;
    FOR EACH BOID b
        IF b != bJ THEN
            IF |b.position - bJ.position| < 100 THEN
                c = c - (b.position - bJ.position)
            END IF
        END IF
    END
    RETURN c
END PROCEDURE
My code is:
std::pair<signed int, signed int> keep_distance(std::vector<Boid> & boids, Boid & boid){
    signed int dx = 0;
    signed int dy = 0;
    for(Boid & b : boids){
        if (boid != b){ // this checks an "id" number, not location
            if (b.dist(boid) < MIN_DIST){
                dx -= b.get_x() - boid.get_x();
                dy -= b.get_y() - boid.get_y();
            }
        }
    }
    return std::pair<signed int, signed int>(dx, dy);
}
with
MIN_DIST = 100;
unsigned int Boid::dist(const Boid & b){
    return (unsigned int) sqrt((b.x - x) * (b.x - x) + (b.y - y) * (b.y - y));
}
The only major difference between these two pieces of code should be that instead of the vector c, I'm using its components.
The order of functions I am using to move each boid around is:
center_of_mass(boids, new_boids[i]); // rule 1
match_velocity(boids, new_boids[i]); // rule 3
keep_within_bound(new_boids[i]);
tendency_towards_place(new_boids[i], mouse_x, mouse_y);
keep_distance(boids, new_boids[i]); // rule 2
Is there something obvious I'm not seeing? Maybe some silly vector arithmetic I did wrong?
The rule doesn't say that boids cannot collide. They just don't want to. :)
As you can see in this snippet:
FOR EACH BOID b
    v1 = rule1(b)
    v2 = rule2(b)
    v3 = rule3(b)
    b.velocity = b.velocity + v1 + v2 + v3
    b.position = b.position + b.velocity
END
There is no check to make sure they don't collide. If the numbers come out unfavorably they will still collide.
That being said, getting the exact same position for multiple boids is still very unlikely; it would point to a programming error.
Later in the article he has this code:
PROCEDURE move_all_boids_to_new_positions()
    Vector v1, v2, v3, ...
    Integer m1, m2, m3, ...
    Boid b
    FOR EACH BOID b
        v1 = m1 * rule1(b)
        v2 = m2 * rule2(b)
        v3 = m3 * rule3(b)
        b.velocity = b.velocity + v1 + v2 + v3 + ...
        b.position = b.position + b.velocity
    END
END PROCEDURE
(Though realistically I would make m1 a double rather than an Integer.) If rule1 is the poorly named rule that makes boids attempt to avoid each other, simply increase the value of m1 and they will turn away from each other faster. Also, increasing MIN_DIST will cause them to notice that they're about to run into each other sooner, and decreasing their maximum velocity (vlim in the function limit_velocity) will allow them to react more sanely to near collisions.
As others mentioned, there's nothing that 100% guarantees collisions don't happen, but these tweaks will make collisions less likely.
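For illustration, applying such a weight in the questioner's component style might look like the sketch below. The velocity accessors (get_vx(), set_vx(), and so on) are assumed names, not from the question:
// Illustrative only: scale rule 2's result by a weight before applying
// it, mirroring the article's m2 * rule2(b).
const double M_KEEP_DISTANCE = 2.0; // raise to make boids turn away harder

void apply_keep_distance(std::vector<Boid>& boids, Boid& b)
{
    std::pair<signed int, signed int> d = keep_distance(boids, b); // rule 2
    b.set_vx(b.get_vx() + static_cast<signed int>(M_KEEP_DISTANCE * d.first));
    b.set_vy(b.get_vy() + static_cast<signed int>(M_KEEP_DISTANCE * d.second));
    // ...same pattern for the other rules and their weights...
}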

glsl cast bool to float

I want to set a float value to 1.0 if one vector equals another, and 0.0 if the vectors are not equal
if( v1 == v2 ) floatVal = 1.0;
else           floatVal = 0.0;
But wouldn't it be "faster" or an optimization just to set
floatVal = (v1 == v2);
But it doesn't work; can't you implicitly (or explicitly) convert a bool to a float? Is there a way to do this, or do I have to use the if-statement branch?
Didn't you try the float(bool) constructor function?
GLSLangSpec.Full.1.20.8.pdf, section 5.4.1, says you can do all those conversions.
CuriousChettai's right. Just write:
floatVal = float(v1 == v2);
GLSL gives you a compile error if you assign values with a possible loss of precision. So you can do things like:
float f = 3;        // works
int i = 3.0;        // compile error
int j = int(3.0);   // works

Is there a C/C++ function to safely handle division by zero?

We have a situation where we want to do a sort of weighted average of two values w1 & w2, based on how far two other values v1 & v2 are away from zero... for example:
If v1 is zero, it doesn't get weighted at all, so we return w2
If v2 is zero, it doesn't get weighted at all, so we return w1
If both values are equally far from zero, we do a mean average and return (w1 + w2)/2
I've inherited code like:
float calcWeightedAverage(v1, v2, w1, w2)
{
    v1 = fabs(v1);
    v2 = fabs(v2);
    return (v1/(v1+v2))*w1 + (v2/(v1+v2))*w2;
}
For a bit of background, v1 & v2 represent how far two different knobs are turned, the weighting of their individual resultant effects only depends how much they are turned, not in which direction.
Clearly, this has a problem when v1==v2==0, since we end up with return (0/0)*w1 + (0/0)*w2 and you can't do 0/0. Putting a special test in for v1==v2==0 sounds horrible mathematically, even if it wasn't bad practice with floating-point numbers.
So I wondered if
there was a standard library function to handle this
there's a neater mathematical representation
You're trying to implement this mathematical function:
F(x, y) = (W1 * |x| + W2 * |y|) / (|x| + |y|)
This function is discontinuous at the point x = 0, y = 0. Unfortunately, as R. stated in a comment, the discontinuity is not removable - there is no sensible value to use at this point.
This is because the "sensible value" changes depending on the path you take to get to x = 0, y = 0. For example, consider following the path F(0, r) from r = R1 to r = 0 (this is equivalent to having the X knob at zero, and smoothly adjusting the Y knob down from R1 to 0). The value of F(x, y) will be constant at W2 until you get to the discontinuity.
Now consider following F(r, 0) (keeping the Y knob at zero and adjusting the X knob smoothly down to zero) - the output will be constant at W1 until you get to the discontinuity.
Now consider following F(r, r) (keeping both knobs at the same value, and adjusting them down simultaneously to zero). The output here will be constant at (W1 + W2) / 2 until you get to the discontinuity.
This implies that any value between W1 and W2 is equally valid as the output at x = 0, y = 0. There's no sensible way to choose between them. (And further, always choosing 0 as the output is completely wrong: the output is otherwise bounded to the interval W1..W2 (i.e., for any path you approach the discontinuity along, the limit of F() is always within that interval), and 0 might not even lie in this interval!)
You can "fix" the problem by adjusting the function slightly - add a constant (eg 1.0) to both v1 and v2 after the fabs(). This will make it so that the minimum contribution of each knob can't be zero - just "close to zero" (the constant defines how close).
It may be tempting to define this constant as "a very small number", but that will just cause the output to change wildly as the knobs are manipulated close to their zero points, which is probably undesirable.
This is the best I could come up with quickly
float calcWeightedAverage(float v1, float v2, float w1, float w2)
{
    float a1 = 0.0;
    float a2 = 0.0;
    if (v1 != 0)
    {
        a1 = v1/(v1+v2) * w1;
    }
    if (v2 != 0)
    {
        a2 = v2/(v1+v2) * w2;
    }
    return a1 + a2;
}
I don't see what would be wrong with just doing this:
#include <cfloat> // FLT_MIN
#include <cmath>  // fabs

float calcWeightedAverage( float v1, float v2, float w1, float w2 ) {
    static const float eps = FLT_MIN; // or some other suitably small value
    v1 = fabs( v1 );
    v2 = fabs( v2 );
    if( v1 + v2 < eps )
        return (w1 + w2) / 2.0f;
    else
        return (v1/(v1+v2))*w1 + (v2/(v1+v2))*w2;
}
Sure, no "fancy" stuff to figure out your division, but why make it harder than it has to be?
Personally I don't see anything wrong with an explicit check for divide by zero. We all do them, so it could be argued that not having it is uglier.
However, it is possible to turn off the IEEE divide-by-zero exceptions. How you do this depends on your platform. I know that on Windows it has to be done process-wide, so you can inadvertently mess with other threads (and they with you) by doing it if you aren't careful.
However, if you do that, your result value will be NaN, not 0. I highly doubt that's what you want. If you are going to have to put a special check in there anyway with different logic when you get NaN, you might as well just check for 0 in the denominator up front.
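For comparison, a sketch of the two styles (mine, not from the answer):
#include <cmath>

// Up-front check, as the answer recommends.
float ratio_up_front(float num, float den, float fallback)
{
    return (den == 0.0f) ? fallback : num / den;
}

// After-the-fact check: requires non-trapping IEEE division. 0.0f/0.0f
// yields NaN; note that x/0 for x != 0 yields +/-infinity, which
// std::isnan does NOT catch, so this only covers the 0/0 case.
float ratio_after_the_fact(float num, float den, float fallback)
{
    float r = num / den;
    return std::isnan(r) ? fallback : r;
}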
So with a weighted average, you need to look at the special case where both are zero. In that case you want to treat it as 0.5 * w1 + 0.5 * w2, right? How about this?
float calcWeightedAverage(float v1, float v2, float w1, float w2)
{
    v1 = fabs(v1);
    v2 = fabs(v2);
    if (v1 == v2) {
        v1 = 0.5;
    } else {
        v1 = v1 / (v1 + v2); // v1 is between 0 and 1
    }
    v2 = 1 - v1; // avoid addition and division because they should add to 1
    return v1 * w1 + v2 * w2;
}
You could test for fabs(v1)+fabs(v2) == 0 (this seems to be the fastest check, given that you've already computed them), and return whatever value makes sense in this case ((w1+w2)/2?). Otherwise, keep the code as-is.
However, I suspect the algorithm itself is broken if v1==v2==0 is possible. This kind of numerical instability when the knobs are "near 0" hardly seems desirable.
If the behavior actually is right and you want to avoid special cases, you could add the minimum positive floating-point value of the given type to v1 and v2 after taking their absolute values. (Note that DBL_MIN and friends are not the correct value, because they're the minimum normalized values; you need the minimum of all positive values, including subnormals.) This will have no effect unless they're already extremely small; the additions will just yield v1 and v2 in the usual case.
The problem with using an explicit check for zero is that you can end up with discontinuities in behaviour unless you are careful, as outlined in caf's response (and if it's in the core of your algorithm, the if can be expensive – but don't care about that until you measure...).
I tend to use something that just smooths out the weighting near zero instead.
float calcWeightedAverage(float v1, float v2, float w1, float w2)
{
    const float eps = 1e-7f; // or whatever you like...
    v1 = fabs(v1) + eps;
    v2 = fabs(v2) + eps;
    return (v1/(v1+v2))*w1 + (v2/(v1+v2))*w2;
}
Your function is now smooth, with no asymptotes or division by zero, and so long as one of v1 or v2 is above 1e-7 by a significant amount it will be indistinguishable from a "real" weighted average.
If the denominator is zero, how do you want it to default? You can do something like this:
static inline float divide_default(float numerator, float denominator, float default_value) {
    // note: "default" is a C++ keyword, so the parameter is named default_value
    return (denominator == 0) ? default_value : (numerator / denominator);
}

float calcWeightedAverage(float v1, float v2, float w1, float w2)
{
    v1 = fabs(v1);
    v2 = fabs(v2);
    return w1 * divide_default(v1, v1 + v2, 0.0f) + w2 * divide_default(v2, v1 + v2, 0.0f);
}
Note that the function definition and use of static inline should really let the compiler know that it can inline.
This should work:
#include <float.h> // FLT_EPSILON
#include <math.h>  // fabs

float calcWeightedAverage(float v1, float v2, float w1, float w2)
{
    v1 = fabs(v1);
    v2 = fabs(v2);
    return (v1/(v1+v2+FLT_EPSILON))*w1 + (v2/(v1+v2+FLT_EPSILON))*w2;
}
Edit:
I saw there may be problems with precision, so instead of FLT_EPSILON use DBL_EPSILON for more accurate results (I guess you will still return a float value).
I'd do it like this:
float calcWeightedAverage(double v1, double v2, double w1, double w2)
{
    v1 = fabs(v1);
    v2 = fabs(v2);
    /* if both values are equally far from 0 */
    if (fabs(v1 - v2) < 0.000000001) return (w1 + w2) / 2;
    return (v1*w1 + v2*w2) / (v1 + v2);
}