Testing a lot of math operations in a class - C++

Is there an easy way to test the functions inside a class for correct results? I have been looking at Google Test for unit testing, but it seems geared more toward finding failures in classes and functions than toward checking the expected numerical result.
For example, from math theory one knows the square root of any number. Now suppose you want to check a sqrt function for floating-point precision errors, and then also check lots of other functions that use floats for precision errors. Is there a way to make this easy and fast?

I can think of two direct solutions:
1)
One of the easiest ways to test the accuracy of mathematical functions is similar to what is used as definition work for limits in calculus: take the value to be tested, and also use a value that is "close" on both sides. I have heard of analogies drawn between limit analysis and unit testing, but keep in mind that if you're looking for speed this will not be your best option, that it only works on continuous operations, and that this analogy is for definition work only.
So you would have a "limitDomain" variable defined per function (because some operations are more accurate than others; for the reasoning, look up the Taylor approximation of the function in question), and then use that as your limiter. Then test: low, high, and the value itself, and take the average of all three within a given margin of error:
float testMathOpX(float _input){
    float low  = _input - limitDomainOpX;
    float high = _input + limitDomainOpX;
    low    = OpX(low);
    _input = OpX(_input);
    high   = OpX(high);
    // doing 3 separate averages with division by 2 means the worst decimal
    // you will have is a trailing 5, or in some cases a trailing 25
    low    = (low + _input) / 2;
    high   = (_input + high) / 2;
    _input = (low + high) / 2;
    return _input;
}
2)
The other method I can think of is more of a table-of-values approach: you take the input, check where on the domain of the operation it lies, and if it lies within certain values you use value replacement. Realize that you need to do a lot of work ahead of time to build this table of values; after that it becomes just domain testing of the value you're taking in, in the form of:
if( (_input > valLow) && (_input < valHigh) ){
    // ... replace the value with an empirically found value
}
The problem with this is that you need to find those empirical values in the first place.

Do you have requirements on the precision or do you want to find the precision?
If it is the former, then it is not hard to create test cases using any test framework.
y = myfunc(x);
if (y > expected_y + allowed_error || y < expected_y - allowed_error) {
    // Test failed
    ...
}
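Since the question mentions Google Test, here is a minimal sketch of such a tolerance test using its EXPECT_NEAR macro; mySqrt is a hypothetical function under test, with a crude stand-in defined so the sketch is self-contained:

#include <gtest/gtest.h>
#include <cmath>

// Hypothetical function under test; crude Newton's-method stand-in
// so the sketch compiles and runs on its own.
double mySqrt(double x) {
    if (x == 0.0) return 0.0;
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 60; ++i) r = 0.5 * (r + x / r);
    return r;
}

// Compare against the library sqrt within an allowed absolute error.
TEST(MathOps, SqrtPrecision) {
    for (double x = 0.0; x <= 100.0; x += 0.5) {
        EXPECT_NEAR(mySqrt(x), std::sqrt(x), 1e-6);
    }
}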
Edit:
There are two routes to finding the precision, through testing and through algorithm analysis.
Testing should be straightforward: Compare the output with the correct values (which you have to obtain in some way).
Algorithm analysis is when you calculate the expected size of the error, by combining the error inherent in the algorithm with the error caused by the limited precision of floating point arithmetic.
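As a classic worked example of the analysis route (a sketch using a standard result, not anything specific to your code): for the forward difference (f(x+h) - f(x))/h, the algorithm's truncation error is roughly (h/2)*|f''(x)| while the floating-point rounding error is roughly 2*eps*|f(x)|/h, so the total error is minimized near h ≈ sqrt(eps), i.e. around 1e-8 for doubles (eps ≈ 2.2e-16).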

Related

Optimization of cbrt() in C++

I am trying to improve the speed of my code, written in C++. Based on profilers, cbrt()/cbrtf32x() is the function I spend the most time in, or more specifically:
#include <cmath>

double test_func(const double &test_val){
    double cbrt_test_val = cbrt(test_val);
    return (1 - 1e-10*cbrt_test_val);
}
According to the data, I spend more than three times as long in cbrt()/cbrtf32x() as in the next most costly function. So I was wondering how to improve this function and speed it up. The input values range from 1e18 to 1e30.
There is little that can be done if you are doing the cubic roots one at a time, and you want the exact result.
As is, I would be surprised if you can improve the cubic root calculation more than 10-20% - if that - while getting the same result numerically. (Note: I got that 10%-20% number out of thin air; it's an opinion, not a scientific number at all.)
If you can batch up the calculations, you might be able to SIMD the operation, or multi-thread them, or if you know more about the distribution of the data (or can find out more,) you might be able to sort them and - I don't know - maybe calculate an incremental cubic root or something.
If you can get away with an approximation, then there are more things that you can do. For example, you are calculating the function f(x) = 1 - cbrt(x) / 1e10, which is the same as 1 - cbrt(x / 1e30) which is a strictly decreasing function that maps the domain [1e18..1e30] to the range [0..0.9999]. With y = x / 1e30 it becomes f(y) = 1 - cbrt(y) and now y is in the range [1e-12..1] and it can be pre-calculated and approximated using a look-up table.
Depending on the number of times you need a cubic root, how much accuracy loss you can get away with (which determines the size of the table,) and whether you can sort or bucket your input (to improve the CPU cache utilization for your LUT look-ups) you might get a nice speed boost out of this.
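A minimal sketch of the look-up-table idea, assuming a log-spaced table with linear interpolation; the table size is illustrative, and a production version would extract the exponent bits directly instead of calling log10, which is not free either:

#include <cmath>

// Log-spaced lookup table for f(y) = 1 - cbrt(y), y in [1e-12, 1].
struct CbrtLut {
    static const int N = 4096;   // table size: accuracy vs. memory trade-off
    double table[N + 1];

    CbrtLut() {
        for (int i = 0; i <= N; ++i) {
            double y = std::pow(10.0, -12.0 + 12.0 * i / N);  // log-spaced sample
            table[i] = 1.0 - std::cbrt(y);
        }
    }

    // Approximate 1 - 1e-10 * cbrt(x) for x in [1e18, 1e30].
    double operator()(double x) const {
        double pos = (std::log10(x * 1e-30) + 12.0) * (N / 12.0);  // fractional index
        int i = static_cast<int>(pos);
        if (i < 0) i = 0;
        if (i > N - 1) i = N - 1;   // clamp to the table edge
        double frac = pos - i;
        return table[i] + frac * (table[i + 1] - table[i]);  // linear interpolation
    }
};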

How to sample from a normal distribution restricted to a certain interval, C++ implementation?

With this function I can sample from a normal distribution. I was wondering how I could sample efficiently from a normal distribution restricted to a certain interval [a,b]. My trivial approach would be to sample from the normal distribution and keep the value if it belongs to the interval, otherwise re-sample. However, this would probably discard many values before I get a suitable one.
I could also approximate the normal distribution using a triangular distribution, however I don't think this would be accurate enough.
I could also try to work on the cumulative function, but probably this would be slow as well. Is there any efficient approach to the problem?
Thx
I'm assuming you know how to transform to and from standard normal with shifting by μ and scaling by σ.
Option 1, as you said, is acceptance/rejection. Generate normals as usual, reject them if they're outside the range [a, b]. It's not as inefficient as you might think. If p = P{a < Z < b}, then the number of trials required follows a geometric distribution with parameter p and the expected number of attempts before accepting a value is 1/p.
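A minimal sketch of Option 1 with <random> (the function name is illustrative):

#include <random>

// Acceptance/rejection: draw from N(mu, sigma) until the value lands in [a, b].
double truncated_normal_reject(double mu, double sigma, double a, double b,
                               std::mt19937 &gen) {
    std::normal_distribution<double> normal(mu, sigma);
    double x;
    do {
        x = normal(gen);   // expected number of draws is 1/p, p = P{a < X < b}
    } while (x < a || x > b);
    return x;
}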
Option 2 is to use an inverse Gaussian function, such as the one in boost. Calculate lo = Φ(a) and hi = Φ(b), the probabilities of your normal being below a and b, respectively. Then generate U distributed uniformly between lo and hi, and crank the resulting set of U's through the inverse Gaussian function and rescale to get outcomes with the desired truncated distribution.
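And a sketch of Option 2, assuming Boost.Math for Φ (cdf) and Φ⁻¹ (quantile):

#include <random>
#include <boost/math/distributions/normal.hpp>

// Inverse-CDF method for a standard normal truncated to [a, b];
// shift by mu and scale by sigma afterwards for the general case.
double truncated_normal_inverse(double a, double b, std::mt19937 &gen) {
    boost::math::normal_distribution<double> N01;     // standard normal
    double lo = boost::math::cdf(N01, a);             // P(Z < a)
    double hi = boost::math::cdf(N01, b);             // P(Z < b)
    std::uniform_real_distribution<double> U(lo, hi); // uniform on [lo, hi]
    return boost::math::quantile(N01, U(gen));        // Phi^-1(u) lies in [a, b]
}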
The (standard) normal probability is an integral; see the formula: P(a < Z < b) = 1/sqrt(2*pi) * integral from a to b of exp(-x*x/2) dx. You can approximate it numerically:
// fctn is the integrand: the standard normal density without its 1/sqrt(2*pi) factor
double fctn(double x) {
    return exp(-(x*x)/2);
}

std::cout << "riemann_midpnt_sum = " << 1 / (sqrt(2*PI)) * riemann_mid_point_sum(fctn, -1.0, 1.0, 100) << '\n';
output: "riemann_midpnt_sum = 0.682698"
This calculates the normal distribution (standard) from -1 to 1.
This uses a Riemann sum to approximate the integral; riemann_mid_point_sum can be any midpoint-rule implementation.
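For instance, a minimal sketch (the signature is inferred from the call above):

// Midpoint Riemann sum of f over [a, b] using n equal subintervals.
double riemann_mid_point_sum(double (*f)(double), double a, double b, int n) {
    double width = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += f(a + (i + 0.5) * width);   // sample f at each midpoint
    }
    return sum * width;
}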
You could have a look at the implementation of the normal dist function in your standard library (e.g., https://gcc.gnu.org/onlinedocs/gcc-4.6.3/libstdc++/api/a00277.html), and figure out a way to re-implement this with your constraint.
It might be tricky to understand the template-heavy library code, but if you really need speed then the trivial approach is not well suited, particularly if your interval is quite small.

What's the origin of this GLSL rand() one-liner?

I've seen this pseudo-random number generator for use in shaders referred to here and there around the web:
float rand(vec2 co){
    return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}
It's variously called "canonical", or "a one-liner I found on the web somewhere".
What's the origin of this function? Are the constant values as arbitrary as they seem or is there some art to their selection? Is there any discussion of the merits of this function?
EDIT: The oldest reference to this function that I've come across is this archive from Feb '08, the original page now being gone from the web. But there's no more discussion of it there than anywhere else.
Very interesting question!
I am trying to figure this out while typing the answer :)
First an easy way to play with it: http://www.wolframalpha.com/input/?i=plot%28+mod%28+sin%28x*12.9898+%2B+y*78.233%29+*+43758.5453%2C1%29x%3D0..2%2C+y%3D0..2%29
Then let's think about what we are trying to do here: For two input coordinates x,y we return a "random number". Now this is not a random number though. It's the same every time we input the same x,y. It's a hash function!
The first thing the function does is to go from 2D to 1D. That is not interesting in itself, but the numbers are chosen so the result typically does not repeat. Also, we have a floating-point addition there; there will be a few more bits from y or x, and the numbers might just be chosen right so that it mixes.
Then we sample a black box sin() function. This will depend a lot on the implementation!
Lastly it amplifies the error in the sin() implementation by multiplying and taking the fraction.
I don't think this is a good hash function in the general case. The sin() is a black box, on the GPU, numerically. It should be possible to construct a much better one by taking almost any hash function and converting it. The hard part is to turn the typical integer operations used in CPU hashing into float (half or 32-bit) or fixed-point operations, but it should be possible.
Again, the real problem with this as a hash function is that sin() is a black box.
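As an illustration of that conversion, a minimal C++ sketch (the same structure ports to GLSL versions with uint support): run the input bits through an integer bit-mixer, then map the result to [0, 1). The constants are from one published low-bias mixer ("lowbias32"); any good mixer would do.

#include <cstdint>

// Integer bit-mix finalizer, then map the top 24 bits to a float in [0, 1).
float hash_to_float(uint32_t x) {
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return (x >> 8) * (1.0f / 16777216.0f);   // top 24 bits / 2^24
}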
The origin is probably the paper: "On generating random numbers, with help of y= [(a+x)sin(bx)] mod 1", W.J.J. Rey, 22nd European Meeting of Statisticians and the 7th Vilnius Conference on Probability Theory and Mathematical Statistics, August 1998
EDIT: Since I can't find a copy of this paper and the "TestU01" reference may not be clear, here's the scheme as described in TestU01 in pseudo-C:
#define A1 ???
#define A2 ???
#define B1 pi*(sqrt(5.0)-1)/2
#define B2 ???

uint32_t n; // position in the stream

double next() {
    double t = fract(A1 * sin(B1*n));
    double u = fract((A2+t) * sin(B2*t));
    n++;
    return u;
}
where the only recommended constant value is B1.
Notice that this is for a stream. Converting it to a 1D hash, 'n' becomes the integer grid. So my guess is that someone saw this and converted 't' into a simple function f(x,y). Using the original constants above, that would yield:
float hash(vec2 co){
    float t = 12.9898*co.x + 78.233*co.y;
    return fract((A2+t) * sin(t));  // any B2 is folded into the 't' computation
}
The constant values are arbitrary, especially in that they are very large and a couple of decimals away from prime numbers.
A modulus over 1 of a high-amplitude sine, multiplied here by ~43758, is a periodic function. It's like a window blind or a corrugated metal sheet made very fine, because it's multiplied by such a large number, and turned at an angle by the dot product.
As the function is 2-D, the dot product has the effect of turning the periodic function at an oblique angle relative to the X and Y axes, by a ratio of approximately 13/79. It is inefficient; you can actually achieve the same by taking the sine of (13x + 79y), which I think would do the same thing with less math.
If you find the period of the function in both X and Y, you can sample it so that it will look like a simple sine wave again.
Here is a picture of it zoomed in: graph
I don't know the origin, but it is similar to many others; if you used it in graphics at regular intervals it would tend to produce moiré patterns, and you could see that it eventually goes around again.
Maybe it's some non-recurrent chaotic mapping, which could explain many things, but it may also be just some arbitrary manipulation of large numbers.
EDIT: Basically, the function fract(sin(x) * 43758.5453) is a simple hash-like function: sin(x) provides smooth interpolation between -1 and 1, so sin(x) * 43758.5453 is an interpolation from -43758.5453 to 43758.5453. This is quite a huge range, so even a small step in x produces a large step in the result and a really large variation in the fractional part. The fract is needed to bring the values into the range [0, 1); GLSL's fract(x) = x - floor(x) is never negative.
Now, when we have something like a hash function, we need a function that produces a hash from a vector. The simplest way would be to call the hash separately for the x and y components of the input vector, but then we would get symmetrical values. So we should derive a single value from the vector; the approach is to pick some fixed vector and take the dot product with it, and here we go: fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
Also, the selected vector's length should be long enough that there are several periods of the sin function after the dot product is computed.
I do not believe this to be the true origin, but the OP's code is presented as a code example in "The Book of Shaders" by Patricio Gonzalez Vivo and Jen Lowe ( https://thebookofshaders.com/10/ ). In their code, Patricio Gonzalez Vivo is cited as the author, i.e. "// Author #patriciogv - 2015".
Since the OP's research dates back even further (to '08), the source might at least explain its popularity, and the author might be able to shed some light on his source.

How do I most effectively prevent my normally-distributed random variable from being zero?

I'm writing a Monte Carlo algorithm, in which at one point I need to divide by a random variable. More precisely: the random variable is used as a step width for a difference quotient, so I actually first multiply something by the variable and then again divide it out of some locally linear function of this expression. Like
double f(double);

std::tr1::variate_generator<std::tr1::mt19937, std::tr1::normal_distribution<> >
    r( std::tr1::mt19937(time(NULL)),
       std::tr1::normal_distribution<>(0) );

double h = r();
double a = ( f(x+h) - f(x) ) / h;
This works fine most of the time, but fails when h==0. Mathematically, this is not a concern because for any finite (or, indeed, countable) selection of normally-distributed random variables, all of them will be nonzero with probability 1. But in the digital implementation I will encounter an h==0 every ≈2³² function calls (regardless of the Mersenne twister having a period longer than the age of the universe, it still outputs ordinary longs!).
It's pretty simple to avoid this trouble, at the moment I'm doing
double h = r();
while (h==0) h=r();
but I don't consider this particularly elegant. Is there any better way?
The function I'm evaluating is actually not just a simple ℝ -> ℝ like f is, but an ℝᵐ×ℝⁿ -> ℝ, in which I calculate the gradient in the ℝᵐ variables while numerically integrating over the ℝⁿ variables. The whole function is superimposed with unpredictable (but "coherent") noise, sometimes with specific (but unknown) outstanding frequencies; that's what gets me into trouble when I try it with fixed values for h.
Your way seems elegant enough; maybe a little different:
do {
    h = r();
} while (h == 0.0);
The ratio of two normally-distributed random variables is the Cauchy distribution. The Cauchy distribution is one of those nasty distributions with an infinite variance. Very nasty indeed. A Cauchy distribution will make a mess of your Monte Carlo experiment.
In many cases where the ratio of two random variables is computed, the denominator is not normal. People often use a normal distribution to approximate this non-normally distributed random variable because
1) normal distributions are usually so easy to work with,
2) they usually have such nice mathematical properties,
3) the normal assumption appears to be more or less correct, and
4) the real distribution is a bear.
Suppose you are dividing by distance. Distance is non-negative by definition, and is often strictly positive as a random variable. So right off the bat, distance can never be normally distributed. Nonetheless, people often assume a normal distribution for distance in cases where the mean is much, much larger than the standard deviation. When this normal assumption is made, you need to protect against those physically impossible values. One simple solution is a truncated normal.
If you want to preserve the normal distribution you have to either exclude 0 or assign 0 to a new, previously non-occurring value. Since the second is most likely not possible in the finite ranges of computer arithmetic, the first is our only option.
The function (f(x+h)-f(x))/h has a limit as h->0, and therefore if you encounter h==0 you should use that limit. The limit would be f'(x), so if you know the derivative you can use it.
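A minimal sketch of that fallback, where df is a hypothetical analytic derivative of f:

double f(double);
double df(double);   // hypothetical analytic derivative of f

// Use the difference quotient when h != 0, and its limit f'(x) when h == 0.
double slope(double x, double h) {
    return (h != 0.0) ? (f(x + h) - f(x)) / h : df(x);
}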
If what you are actually doing, though, is creating a number of discrete points that approximate a normal distribution, and this is good enough for your distribution, create them in a way that none of them actually has the value 0.
Depending on what you're trying to compute, perhaps something like this would work:
double h = r();
double a;
if (h != 0)
    a = ( f(x+h) - f(x) ) / h;
else
    a = 0;
If f is a linear function, this should (I think?) remain continuous at h = 0.
You might also want to instead consider trapping division-by-zero exceptions to avoid the cost of the branch. Note that this may or may not have a detrimental effect on performance - benchmark both ways!
On Linux, you will need to build the file that contains your potential division by zero with -fnon-call-exceptions, and install a SIGFPE handler:
#include <csignal>   // signal, SIGFPE
#include <fenv.h>    // feenableexcept (glibc extension)

struct fp_exception { };

void sigfpe(int) {
    signal(SIGFPE, sigfpe);   // re-install the one-shot handler
    throw fp_exception();
}

void setup() {
    // by default x/0.0 just yields inf; unmask the trap so it raises SIGFPE
    feenableexcept(FE_DIVBYZERO);
    signal(SIGFPE, sigfpe);
}
// Later...
try {
    run_one_monte_carlo_trial();
} catch (fp_exception &) {
    // skip this trial
}
On Windows, use SEH:
__try
{
    run_one_monte_carlo_trial();
}
// Note: for a double division this is the floating-point exception code; the
// FP divide-by-zero trap must also be unmasked (e.g. with _controlfp_s).
__except(GetExceptionCode() == EXCEPTION_FLT_DIVIDE_BY_ZERO ?
         EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
    // skip this trial
}
This has the advantage of potentially having less effect on the fast path: there is no branch, although there may be some adjustment of exception handler records. On Linux, there may be a small performance hit due to the compiler generating more conservative code for -fnon-call-exceptions. This is less likely to be a problem if the code compiled under -fnon-call-exceptions does not allocate any automatic (stack) C++ objects. It's also worth noting that this makes the case in which division by zero does happen VERY expensive.

My code works, but I'm not satisfied with the efficiency. (C++ int to binary to scaled double)

Code pasted here
Hello SO. I just wrote my first semi-significant PC program, written purely for fun/to solve a problem, having not been assigned the problem in a programming class. I'm sure many of you remember the first significant program that you wrote for fun.
My issue is, I am not satisfied with the efficiency of my code. I'm not sure if it's the I/O limitation of my terminal or my code itself, but it seems to run quite slowly for DAC resolutions of 8 bits or higher.
I haven't commented the code, so here is an explanation of the problem that I was attempting to solve with this program:
The output voltage of a DAC is determined by a binary number having bits Bn, Bn-1 ... B0, and a full-scale voltage.
The output voltage has an equation of the form:
Vo = G * ( Bn/2^0 + Bn-1/2^1 + ... + B0/2^n )
where G is the gain that makes an input B with all bits high produce the full-scale voltage.
If you run the code the idea will be quite clear.
My issue is, I think that what I'm outputting to the console can be achieved in far fewer than 108 lines of C++. Yes, it could easily be done by precomputing the step voltage and simply rendering the table by incrementing, but a "self-requirement" I have for this program is that, on some level, it performs the series calculation described above for each binary-represented input.
I'm not trying to be crazy with that requirement. I'd like this program to prove the nature of the formula, which it currently does. What I'm looking for is some suggestions on how to make my implementation generally cleaner and more efficient.
pow(2.0, x) is almost always a bad idea, especially if you're iterating over x: pow(2.0, 0) == 1, and pow(2.0, x) == 2 * pow(2.0, x-1).
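A minimal sketch of that recurrence applied to the series in the question (the MSB-first bits container is illustrative):

#include <vector>

// Replace pow(2.0, -k) with an incrementally updated weight:
// one multiplication per bit instead of a pow() call per bit.
double seriesSum(const std::vector<bool> &bits) {   // bits are MSB-first
    double weight = 1.0;    // weight of the first (MSB) term
    double sum = 0.0;
    for (bool b : bits) {
        if (b) sum += weight;
        weight *= 0.5;      // next term's weight is half the previous one
    }
    return sum;
}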
DtoAeqn returns an unbalanced string (one extra ")"); that hurts my brain.
You could use Horner's method to evaluate the formula efficiently. Here's an example I use to demonstrate it for converting binary strings to decimal:
0.1101 = (1*2^-1 + 1*2^-2 + 0*2^-3 + 1*2^-4).
To evaluate this expression efficiently, rewrite the terms from right to left, then nest and evaluate as follows:
(((1 * 2^-1 + 0) * 2^-1 + 1) * 2^-1 + 1) * 2^-1 = 0.8125.
This way, you've removed the "pow" function, and you only have to do one multiplication (division) for each bit.
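A minimal sketch of that evaluation (the function name and string input are illustrative; the weights are the fractional-binary ones from the example above):

#include <string>

// Horner evaluation of gain * (B_n/2 + B_{n-1}/4 + ... + B_0/2^(n+1)),
// walking the MSB-first input code from its rightmost (LSB) end.
double dacVoltage(const std::string &bits, double gain) {
    double acc = 0.0;
    for (auto it = bits.rbegin(); it != bits.rend(); ++it) {
        acc = (acc + (*it == '1' ? 1.0 : 0.0)) * 0.5;  // one multiply per bit
    }
    return gain * acc;
}

With bits = "1101" and gain = 1.0 this returns 0.8125, matching the worked example.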
By the way, I see your code allows up to 128 bits of precision. You won't be able to compute that accurately in a double.