Accuracy of Rosenbrock's test function calculation - c++

I want to calculate Rosenbrock's test function
I have implemented the following C/C++ code
#include <stdio.h>
/********/
/* MAIN */
/********/
int main()
{
const int N = 900000;
float *x = (float *)malloc(N * sizeof(float));
for (int i=0; i<N; i++) x[i] = 3.f;
float sum_host = 0.f;
for (int i=0; i<N-1; i++) {
float temp = (100.f * (x[i+1] - x[i] * x[i]) * (x[i+1] - x[i] * x[i]) + (x[i] - 1.f) * (x[i] - 1.f));
sum_host = sum_host + temp;
printf("%i %f %f\n", i, temp, sum_host);
}
printf("Result for Rosenbrock's test function calculation = %f\n", sum_host);
}
Since the x array is initialized to 3.f, then each summation term should be 3604.f, so that the final summation involving 899999 terms should be 3243596396. However, the result I get is 3229239296, with an absolute error of 14357100. If I measure the difference between two consecutive partial summations, I see that it is 3600.f for the early partial summations and then it drops to 3584 for the last ones, while it should always be 3604.f.
If I use Kahan summation algorithm as
sum_host = 0.f;
float c = 0.f;
for (int i=0; i<N-1; i++) {
float temp = (100.f * (x[i+1] - x[i] * x[i]) * (x[i+1] - x[i] * x[i]) + (x[i] - 1.f) * (x[i] - 1.f)) - c;
float t = sum_host + temp;
c = (t - sum_host) - temp;
sum_host = t;
}
the result I get is 3243596288, with a much smaller absolute error of 108.
I'm pretty sure that this effect I see should be ascribed to the precision of floating point arithmetics. Could someone confirm this and provide me an explanation of the mechanism according to which this occurs?

You compute temp = 3604.0f accurately at each iteration. The problem arises when you try adding 3604.0f to something else and round the result to the nearest float. floats store an exponent and a 23-bit significand, meaning any result with 1-bits more than 24 places apart is going to get rounded to something other than what it is.
Note that 3604 = 901 * 4 and the binary expansion of 901 is 1110000101; you'll start seeing roundoff once you start adding temp to something bigger than 2^24 * 4 = 67108864. (This happens when you run the code, too; it starts printing out 3600 as the difference between consecutive sum_host's right when sum_host exceeds 67108864.) You start seeing even more roundoff when you're adding temp to something bigger than 2^26 * 4; at that point, the second smallest '1' bit is getting swallowed as well.
Note that, after you do Kahan summation, sum_host is what you report AND c is -108. This is loosely because c is keeping track of the next most significant 24 bits.

Typical float is only good to maybe 7 digits of precision. Repeatedly adding 3604 to a number 100000x larger than it does not well accumulate the lesser significant digits.
Use double.

Related

Need help understanding this line in an FFT algorithm

In my program I have a function that performs the fast Fourier transform. I know there are very good implementations freely available, but this is a learning thing so I don't want to use those. I ended up finding this comment with the following implementation (it originated from the Italian entry for the FFT):
void transform(complex<double>* f, int N) //
{
ordina(f, N); //first: reverse order
complex<double> *W;
W = (complex<double> *)malloc(N / 2 * sizeof(complex<double>));
W[1] = polar(1., -2. * M_PI / N);
W[0] = 1;
for(int i = 2; i < N / 2; i++)
W[i] = pow(W[1], i);
int n = 1;
int a = N / 2;
for(int j = 0; j < log2(N); j++) {
for(int k = 0; k < N; k++) {
if(!(k & n)) {
complex<double> temp = f[k];
complex<double> Temp = W[(k * a) % (n * a)] * f[k + n];
f[k] = temp + Temp;
f[k + n] = temp - Temp;
}
}
n *= 2;
a = a / 2;
}
free(W);
}
I've made a lot of changes by now but this was my starting point. One of the changes I made was to not cache the twiddle factors, because I decided to see if it's needed first. Now I've decided I do want to cache them. The way this implementation seems to do it is it has this array W of length N/2, where every index k has the value . What I don't understand is this expression:
W[(k * a) % (n * a)]
Note that n * a is always equal to N/2. I get that this is supposed to be equal to , and I can see that , which this relies on. I also get that modulo can be used here because the twiddle factors are cyclic. But there's one thing I don't get: this is a length-N DFT, and yet only N/2 twiddle factors are ever calculated. Shouldn't the array be of length N, and the modulo should be by N?
But there's one thing I don't get: this is a length-N DFT, and yet only N/2 twiddle factors are ever calculated. Shouldn't the array be of length N, and the modulo should be by N?
The twiddle factors are equally spaced points on the unit circle, and there is an even number of points because N is a power-of-two. After going around half of the circle (starting at 1, going counter clockwise above the X-axis), the second half is a repeat of the first half but this time it's below the X-axis (the points can be reflected through the origin). That is why Temp is subtracted the second time. That subtraction is the negation of the twiddle factor.

Hyperbolic sine without math.h

im new to code and c++ for a homework assignment im to create a code for sinh without the math file. I understand the math behind sinh, but i have no idea how to code it, any help would be highly appreciated.
According to Wikipedia, there is a Taylor series for sinh:
sinh(x) = x + (pow(x, 3) / 3!) + (pow(x, 5) / 5!) + pow(x, 7) / 7! + ...
One challenge is that you are not allowed to use the pow function. The other is calculating the factorial.
The series is a sum of terms, so you'll need a loop:
double sum = 0.0;
for (unsigned int i = 0; i < NUMBER_OF_TERMS; ++i)
{
sum += Term(i);
}
You could implement Term as a separate function, but you may want to take advantage of declaring and using variables in the loop (that the function may not have access to).
Consider that pow(x, N) expands to x * x * x...
This means that in each iteration the previous value is multiplied by the present value. (This will come in handy later.)
Consider that N! expands to 1 * 2 * 3 * 4 * 5 * ...
This means that in each iteration, the previous value is multiplied by the iteration number.
Let's revisit the loop:
double sum = 0.0;
double power = 1.0;
double factorial = 1.0;
for (unsigned int i = 1; i <= NUMBER_OF_TERMS; ++i)
{
// Calculate pow(x, i)
power = power * x;
// Calculate x!
factorial = factorial * i;
}
One issue with the above loop is that the pow and factorial need to be calculated for each iteration, but the Taylor Series terms use the odd iterations. This is solved by calculated the terms for odd iterations:
for (unsigned int i = 1; i <= NUMBER_OF_TERMS; ++i)
{
// Calculate pow(x, i)
power = power * x;
// Calculate x!
factorial = factorial * i;
// Calculate sum for odd iterations
if ((i % 2) == 1)
{
// Calculate the term.
sum += //...
}
}
In summary, the pow and factorial functions are broken down into iterative pieces. The iterative pieces are placed into a loop. Since the Taylor Series terms are calculated with odd iteration values, a check is placed into the loop.
The actual calculation of the Taylor Series term is left as an exercise for the OP or reader.

Cache misses from random access array

I have an array of values corresponding to an integer indexed axis. I need to linearly interpolate from these values at specific double precision indices.
double indices[20];
double results[20];
double values[1000];
// ...
for (int i = 0; i < 20; i++)
{
double index = indices[i];
int indexInt = (int)index;
double frac = index - indexInt;
// Linear interpolation
result[i] = values[indexInt] * (1.0 - frac) + values[indexInt + 1] * frac;
}
Profiling shows that the result linear interpolation line is taking more program run time than expected, and my suspicion is that this is due to cache misses. The indices are sorted but not guaranteed to be close to each other, and do not have a constant stride. Is there a way to mitigate this?

Function for calculating Pi using taylor series in c++

So I'm at a loss on why my code isn't working, essentially the function I am writing calculates an estimate for Pi using the taylor series, it just crashes whenever I try run the program.
here is my code
#include <iostream>
#include <math.h>
#include <stdlib.h>
using namespace std;
double get_pi(double accuracy)
{
double estimate_of_pi, latest_term, estimated_error;
int sign = -1;
int n;
estimate_of_pi = 0;
n = 0;
do
{
sign = -sign;
estimated_error = 4 * abs(1.0 / (2*n + 1.0)); //equation for error
latest_term = 4 * (1.0 *(2.0 * n + 1.0)); //calculation for latest term in series
estimate_of_pi = estimate_of_pi + latest_term; //adding latest term to estimate of pi
n = n + 1; //changing value of n for next run of the loop
}
while(abs(latest_term)< estimated_error);
return get_pi(accuracy);
}
int main()
{
cout << get_pi(100);
}
the logic behind the code is the following:
define all variables
set estimate of pi to be 0
calculate a term from the taylor series and calculate the error in
this term
it then adds the latest term to the estimate of pi
the program should then work out the next term in the series and the error in it and add it to the estimate of pi, until the condition in the while statement is satisfied
Thanks for any help I might get
There are several errors in your function. See my comments with lines starting with "//NOTE:".
double get_pi(double accuracy)
{
double estimate_of_pi, latest_term, estimated_error;
int sign = -1;
int n;
estimate_of_pi = 0;
n = 0;
do
{
sign = -sign;
//NOTE: This is an unnecessary line.
estimated_error = 4 * abs(1.0 / (2*n + 1.0)); //equation for error
//NOTE: You have encoded the formula incorrectly.
// The RHS needs to be "sign*4 * (1.0 /(2.0 * n + 1.0))"
// ^^^^ ^
latest_term = 4 * (1.0 *(2.0 * n + 1.0)); //calculation for latest term in series
estimate_of_pi = estimate_of_pi + latest_term; //adding latest term to estimate of pi
n = n + 1; //changing value of n for next run of the loop
}
//NOTE: The comparison is wrong.
// The conditional needs to be "fabs(latest_term) > estimated_error"
// ^^^^ ^^^
while(abs(latest_term)< estimated_error);
//NOTE: You are calling the function again.
// This leads to infinite recursion.
// It needs to be "return estimate_of_pi;"
return get_pi(accuracy);
}
Also, the function call in main is wrong. It needs to be:
get_pi(0.001)
to indicate that if the absolute value of the term is less then 0.001, the function can return.
Here's an updated version of the function that works for me.
double get_pi(double accuracy)
{
double estimate_of_pi, latest_term;
int sign = -1;
int n;
estimate_of_pi = 0;
n = 0;
do
{
sign = -sign;
latest_term = sign * 4 * (1.0 /(2.0 * n + 1.0)); //calculation for latest term in series
estimate_of_pi += latest_term; //adding latest term to estimate of pi
++n; //changing value of n for next run of the loop
}
while(fabs(latest_term) > accuracy);
return estimate_of_pi;
}
Your return statement may be the cause.
Try returning "estimate_of_pi" instead of get_pi(accuracy).
Your break condition can be rewritten as
2*n + 1 < 1/(2*n + 1) => (2*n + 1)^2 < 1
and this will never be true for any positive n. Thus your loop will never end. After fixing this you should change the return statement to
return estimated_error;
You currently are calling the function recursively without an end (assuming you fixed the stop condition).
Moreoever you have a sign and the parameter accuracy that you do not use at all in the calculation.
My advice for such iterations would be to always break on some maximum number of iterations. In this case you know it converges (assuming you fix the maths), but in general you can never be sure that your iteration converges.

Different results between Debug and Release

I have the problem that my code returns different results when comparing debug to release. I checked that both modes use /fp:precise, so that should not be the problem. The main issue I have with this is that the complete image analysis (its an image understanding project) is completely deterministic, there's absolutely nothing random in it.
Another issue with this is the fact that my release build actually always returns the same result (23.014 for the image), while debug returns some random value between 22 and 23, which just should not be. I've already checked whether it may be thread related, but the only part in the algorithm which is multi-threaded returns the precisely same result for both debug and release.
What else may be happening here?
Update1: The code I now found responsible for this behaviour:
float PatternMatcher::GetSADFloatRel(float* sample, float* compared, int sampleX, int compX, int offX)
{
if (sampleX != compX)
{
return 50000.0f;
}
float result = 0;
float* pTemp1 = sample;
float* pTemp2 = compared + offX;
float w1 = 0.0f;
float w2 = 0.0f;
float w3 = 0.0f;
for(int j = 0; j < sampleX; j ++)
{
w1 += pTemp1[j] * pTemp1[j];
w2 += pTemp1[j] * pTemp2[j];
w3 += pTemp2[j] * pTemp2[j];
}
float a = w2 / w3;
result = w3 * a * a - 2 * w2 * a + w1;
return result / sampleX;
}
Update2:
This is not reproducible with 32bit code. While debug and release code will always result in the same value for 32bit, it still is different from the 64bit release version, and the 64bit debug still returns some completely random values.
Update3:
Okay, I found it to certainly be caused by OpenMP. When I disable it, it works fine. (both Debug and Release use the same code, and both have OpenMP activated).
Following is the code giving me trouble:
#pragma omp parallel for shared(last, bestHit, cVal, rad, veneOffset)
for(int r = 0; r < 53; ++r)
{
for(int k = 0; k < 3; ++k)
{
for(int c = 0; c < 30; ++c)
{
for(int o = -1; o <= 1; ++o)
{
/*
r: 2.0f - 15.0f, in 53 steps, representing the radius of blood vessel
c: 0-29, in steps of 1, representing the absorption value (collagene)
iO: 0-2, depending on current radius. Signifies a subpixel offset (-1/3, 0, 1/3)
o: since we are not sure we hit the middle, move -1 to 1 pixels along the samples
*/
int offset = r * 3 * 61 * 30 + k * 30 * 61 + c * 61 + o + (61 - (4*w+1))/2;
if(offset < 0 || offset == fSamples.size())
{
continue;
}
last = GetSADFloatRel(adapted, &fSamples.at(offset), 4*w+1, 4*w+1, 0);
if(bestHit > last)
{
bestHit = last;
rad = (r+8)*0.25f;
cVal = c * 2;
veneOffset =(-0.5f + (1.0f / 3.0f) * k + (1.0f / 3.0f) / 2.0f);
if(fabs(veneOffset) < 0.001)
veneOffset = 0.0f;
}
last = GetSADFloatRel(input, &fSamples.at(offset), w * 4 + 1, w * 4 + 1, 0);
if(bestHit > last)
{
bestHit = last;
rad = (r+8)*0.25f;
cVal = c * 2;
veneOffset = (-0.5f + (1.0f / 3.0f) * k + (1.0f / 3.0f) / 2.0f);
if(fabs(veneOffset) < 0.001)
veneOffset = 0.0f;
}
}
}
}
}
Note: with Release mode and OpenMP activated I get the same result as with deactivating OpenMP. Debug mode and OpenMP activated gets a different result, OpenMP deactivated gets the same result as with Release.
At least two possibilities:
Turning on optimization may result in the compiler reordering operations. This can introduce small differences in floating-point calculations when compared to the order executed in debug mode, where operation reordering does not occur. This may account for numerical differences between debug and release, but does not account for numerical differences from one run to the next in debug mode.
You have a memory-related bug in your code, such as reading/writing past the bounds of an array, using an uninitialized variable, using an unallocated pointer, etc. Try running it through a memory checker, such as the excellent Valgrind, to identify such problems. Memory related errors may account for non-deterministic behavior.
If you are on Windows, then Valgrind isn't available (pity), but you can look here for a list of alternatives.
To elaborate on my comment, this is the code that is most probably the root of your problem:
#pragma omp parallel for shared(last, bestHit, cVal, rad, veneOffset)
{
...
last = GetSADFloatRel(adapted, &fSamples.at(offset), 4*w+1, 4*w+1, 0);
if(bestHit > last)
{
last is only assigned to before it is read again so it is a good candidate for being a lastprivate variable, if you really need the value from the last iteration outside the parallel region. Otherwise just make it private.
Access to bestHit, cVal, rad, and veneOffset should be synchronised by a critical region:
#pragma omp critical
if (bestHit > last)
{
bestHit = last;
rad = (r+8)*0.25f;
cVal = c * 2;
veneOffset =(-0.5f + (1.0f / 3.0f) * k + (1.0f / 3.0f) / 2.0f);
if(fabs(veneOffset) < 0.001)
veneOffset = 0.0f;
}
Note that by default all variables, except the counters of parallel for loops and those defined inside the parallel region, are shared, i.e. the shared clause in your case does nothing unless you also apply the default(none) clause.
Another thing that you should be aware of is that in 32-bit mode Visual Studio uses x87 FPU math while in 64-bit mode it uses SSE math by default. x87 FPU does intermediate calculations using 80-bit floating point precision (even for calculations involving float only) while the SSE unit supports only the standard IEEE single and double precisions. Introducing OpenMP or any other parallelisation technique to a 32-bit x87 FPU code means that at certain points intermediate values should be converted back to the single precision of float and if done sufficiently many times a slight or significant difference (depending on the numerical stability of the algorithm) could be observed between the results from the serial code and the parallel one.
Based on your code, I would suggest that the following modified code would give you good parallel performance because there is no synchronisation at each iteration:
#pragma omp parallel private(last)
{
int rBest = 0, kBest = 0, cBest = 0;
float myBestHit = bestHit;
#pragma omp for
for(int r = 0; r < 53; ++r)
{
for(int k = 0; k < 3; ++k)
{
for(int c = 0; c < 30; ++c)
{
for(int o = -1; o <= 1; ++o)
{
/*
r: 2.0f - 15.0f, in 53 steps, representing the radius of blood vessel
c: 0-29, in steps of 1, representing the absorption value (collagene)
iO: 0-2, depending on current radius. Signifies a subpixel offset (-1/3, 0, 1/3)
o: since we are not sure we hit the middle, move -1 to 1 pixels along the samples
*/
int offset = r * 3 * 61 * 30 + k * 30 * 61 + c * 61 + o + (61 - (4*w+1))/2;
if(offset < 0 || offset == fSamples.size())
{
continue;
}
last = GetSADFloatRel(adapted, &fSamples.at(offset), 4*w+1, 4*w+1, 0);
if(myBestHit > last)
{
myBestHit = last;
rBest = r;
cBest = c;
kBest = k;
}
last = GetSADFloatRel(input, &fSamples.at(offset), w * 4 + 1, w * 4 + 1, 0);
if(myBestHit > last)
{
myBestHit = last;
rBest = r;
cBest = c;
kBest = k;
}
}
}
}
}
#pragma omp critical
if (bestHit > myBestHit)
{
bestHit = myBestHit;
rad = (rBest+8)*0.25f;
cVal = cBest * 2;
veneOffset =(-0.5f + (1.0f / 3.0f) * kBest + (1.0f / 3.0f) / 2.0f);
if(fabs(veneOffset) < 0.001)
veneOffset = 0.0f;
}
}
It only stores the values of the parameters that give the best hit in each thread and then at the end of the parallel region it computes rad, cVal and veneOffset based on the best values. Now there is only one critical region, and it is at the end of code. You can get around it also, but you would have to introduce additional arrays.
One thing to double check is that all variables are initialized. Many times un-optimized code (Debug mode) will initialize memory.
I would have said variable initialization in debug vs not there in release. But your results would not back this up (reliable result in release).
Does your code rely on any specific offsets or sizes? Debug build would place guards bytes around some allocations.
Could it be floating point related?
The debug floating point stack is different to the release which is built for more efficiency.
Look here: http://thetweaker.wordpress.com/2009/08/28/debugrelease-numerical-differences/
Just about any undefined behavior can account for this: uninitialized
variables, rogue pointers, multiple modifications of the same object
without an intervening sequence point, etc. etc. The fact that the
results are at times unreproduceable argues somewhat for an
uninitialized variable, but it can also occur from pointer problems or
bounds errors.
Be aware that optimization can change results, especially on an Intel.
Optimization can change which intermediate values spill to memory, and
if you've not carefully used parentheses, even the order of evaluation
in an expression. (And as we all know, in machine floating point, (a +
b) + c) != a + (b + c).) Still the results should be deterministic:
you will get different results according to the degree of optimization,
but for any set of optimization flags, you should get the same results.