Rcpp version of base-R seq drops values - c++

I wrote an Rcpp version of the base-R seq function.
library(Rcpp)
cppFunction('NumericVector seqC(double x, double y, double by) {
// length of result vector
int nRatio = (y - x) / by;
NumericVector anOut(nRatio + 1);
// compute sequence
int n = 0;
for (double i = x; i <= y; i = i + by) {
anOut[n] = i;
n += 1;
}
return anOut;
}')
For the following tests, it works just fine.
seqC(1, 11, 2)
[1] 1 3 5 7 9 11
seqC(1, 10, 2)
[1] 1 3 5 7 9
Also, it works (sometimes) when passing values with decimal digits rather than
integers.
seqC(0.43, 0.45, 0.001)
[1] 0.430 0.431 0.432 0.433 0.434 0.435 0.436 0.437 0.438 0.439 0.440 0.441 0.442 0.443 0.444 0.445 0.446 0.447 0.448 0.449 0.450
However, sometimes the function does not seem to work as expected since the last
entry of the sequence is being dropped (or rather, the output vector anOut
does not have the proper size), which - according to my rather scarce C++ skills,
may be attributed to some kind of rounding errors.
seqC(0.53, 0.59, 0.001)
[1] 0.530 0.531 0.532 0.533 0.534 0.535 0.536 0.537 0.538 0.539 0.540 0.541 0.542 0.543 0.544 0.545 0.546 0.547 0.548 0.549 0.550 0.551
[23] 0.552 0.553 0.554 0.555 0.556 0.557 0.558 0.559 0.560 0.561 0.562 0.563 0.564 0.565 0.566 0.567 0.568 0.569 0.570 0.571 0.572 0.573
[45] 0.574 0.575 0.576 0.577 0.578 0.579 0.580 0.581 0.582 0.583 0.584 0.585 0.586 0.587 0.588 0.589
In the last example, for instance, the last value (0.590) is missing. Does
anyone know how to fix this?

As noted by others, the problem you are experiencing is fundamentally a floating-point arithmetic error. A common workaround is to scale your doubles up to sufficiently large integers, perform the task, and then rescale the result to the original scale of your inputs. I took a slightly different approach than @RHertel by letting the amount of scaling (adjust) be determined by the precision of the increment rather than using a fixed amount, but the idea is essentially the same.
#include <Rcpp.h>
struct add_multiple {
int incr;
int count;
add_multiple(int incr)
: incr(incr), count(0)
{}
inline int operator()(int d) {
return d + incr * count++;
}
};
// [[Rcpp::export]]
Rcpp::NumericVector rcpp_seq(double from_, double to_, double by_ = 1.0) {
int adjust = std::pow(10, std::ceil(std::log10(10 / by_)) - 1);
int from = adjust * from_;
int to = adjust * to_;
int by = adjust * by_;
std::size_t n = ((to - from) / by) + 1;
Rcpp::IntegerVector res = Rcpp::rep(from, n);
add_multiple ftor(by);
std::transform(res.begin(), res.end(), res.begin(), ftor);
return Rcpp::NumericVector(res) / adjust;
}
/*** R
all.equal(seq(.53, .59, .001), seqC(.53, .59, .001)) &&
all.equal(seq(.53, .59, .001), rcpp_seq(.53, .59, .001))
# [1] TRUE
all.equal(seq(.53, .54, .000001), seqC(.53, .54, .000001)) &&
all.equal(seq(.53, .54, .000001), rcpp_seq(.53, .54, .000001))
# [1] TRUE
microbenchmark::microbenchmark(
"seq" = seq(.53, .54, .000001),
"seqC" = seqC(0.53, 0.54, 0.000001),
"rcpp_seq" = rcpp_seq(0.53, 0.54, 0.000001),
times = 100L)
Unit: microseconds
expr min lq mean median uq max neval
seq 896.190 1015.7940 1167.4708 1132.466 1221.624 1651.571 100
seqC 212293.307 219527.6590 226933.4329 223384.592 227860.410 398462.561 100
rcpp_seq 182.848 194.1665 225.4338 227.396 244.942 320.436 100
*/
Here seqC is @RHertel's revised implementation, which produces the correct result. FWIW, I think the slow performance of that function is mainly due to the use of push_back on the NumericVector type, which the Rcpp developers strongly advise against.
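For comparison, here is a minimal sketch in plain C++ (no Rcpp) of a third approach: compute the length once, with a small fuzz so that a ratio such as 59.99999999999994 still counts as 60, and then fill each element as from + k * by instead of accumulating. The name seq_by_index and the fuzz value 1e-10 are assumptions made for illustration, and the sketch only handles increasing sequences.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: increasing sequence from `from` up to at most `to`.
std::vector<double> seq_by_index(double from, double to, double by) {
    assert(by > 0.0 && to >= from);
    // Assumed fuzz: absorbs round-off in the ratio before truncating.
    const double fuzz = 1e-10;
    const std::size_t steps =
        static_cast<std::size_t>(std::floor((to - from) / by + fuzz));
    std::vector<double> out(steps + 1);  // allocated once, no push_back
    for (std::size_t k = 0; k < out.size(); ++k) {
        // Computed from the index, so errors do not accumulate.
        out[k] = from + static_cast<double>(k) * by;
    }
    return out;
}
```

Because the vector is sized up front, the same idea carries over to an Rcpp::NumericVector without any push_back calls.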

The "<=" can create difficulties with floating point numbers. This is a variant of the famous question "Why are these numbers not equal?". Moreover, there is a similar issue with the vector length, which in the case of your last example should be 60, but it is actually calculated to be 59. This is most likely due to the conversion to an integer (by casting, i.e., truncation) of a value like 59.999999 or something similar.
It seems to be very difficult to fix these problems, so I have rewritten a considerable part of the code, hoping that now the function operates as required.
The following code should provide correct results for essentially any kind of increasing series (i.e., y > x, by > 0).
cppFunction('NumericVector seqC(double x, double y, double by) {
NumericVector anOut(1);
// compute sequence
double min_by = 1.e-8;
if (by < min_by) min_by = by/100;
double i = x + by;
anOut(0) = x;
while(i/min_by < y/min_by + 1) {
anOut.push_back(i);
i += by;
}
return anOut;
}')
Hope this helps. And thanks a lot to @Konrad Rudolph for pointing out mistakes in my previous attempts!

Related

Why is my Rcpp code much slower than glmnet's?

I edited the lasso code from this site to use it for multiple lambda values.
I used lassoshooting package for one lambda value (this package works for one lambda value) and glmnet for multiple lambda values for comparison.
The coefficient estimates are different and this is expected because of standardization and scaling back to original scale. This is out of scope and not important here.
For one parameter case, lassoshooting is 1.5 times faster.
Both methods used all 100 lambda values in my code for multiple lambda case. But glmnet is 7.5 times faster than my cpp code. Of course, I expected that glmnet was faster, but this amount seems too much. Is it normal or is my code wrong?
EDIT
I also attached lshoot function which calculates coefficient path in an R loop. This outperforms my cpp code too.
Can I improve my cpp code?
C++ code:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
vec softmax_cpp(const vec & x, const vec & y) {
return sign(x) % max(abs(x) - y, zeros(x.n_elem));
}
// [[Rcpp::export]]
mat lasso(const mat & X, const vec & y, const vec & lambda,
const double tol = 1e-7, const int max_iter = 10000){
int p = X.n_cols; int lam = lambda.n_elem;
mat XX = X.t() * X;
vec Xy = X.t() * y;
vec Xy2 = 2 * Xy;
mat XX2 = 2 * XX;
mat betas = zeros(p, lam); // to store the betas
vec beta = zeros(p); // initial beta for each lambda
bool converged = false;
int iteration = 0;
vec beta_prev, aj, cj;
for(int l = 0; l < lam; l++){
while (!converged && (iteration < max_iter)){
beta_prev = beta;
for (int j = 0; j < p; j++){
aj = XX2(j,j);
cj = Xy2(j) - dot(XX2.row(j), beta) + beta(j) * XX2(j,j);
beta(j) = as_scalar(softmax_cpp(cj / aj, as_scalar(lambda(l)) / aj));
}
iteration = iteration + 1;
converged = norm(beta_prev - beta, 1) < tol;
}
betas.col(l) = beta;
iteration = 0;
converged = false;
}
return betas;
}
R code:
library(Rcpp)
library(rbenchmark)
library(glmnet)
library(lassoshooting)
sourceCpp("LASSO.cpp")
library(ElemStatLearn)
X <- as.matrix(prostate[,-c(9,10)])
y <- as.matrix(prostate[,9])
lambda_one <- 0.1
benchmark(cpp=lasso(X,y,lambda_one),
lassoshooting=lassoshooting(X,y,lambda_one)$coefficients,
order="relative", replications=100)[,1:4]
################################################
lambda <- seq(0,10,len=100)
benchmark(cpp=lasso(X,y,lambda),
glmn=coef(glmnet(X,y,lambda=lambda)),
order="relative", replications=100)[,1:4]
####################################################
EDIT
lambda <- seq(0,10,len=100)
lshoot <- function(lambda){
betas <- matrix(NA,8,100)
for(l in 1:100){
betas[, l] <- lassoshooting(X,y,lambda[l])$coefficients
}
return(betas)
}
benchmark(cpp=lasso(X,y,lambda),
lassoshooting_loop=lshoot(lambda),
order="relative", replications=300)[,1:4]
Results for one parameter case:
test replications elapsed relative
2 lassoshooting 300 0.06 1.0
1 cpp 300 0.09 1.5
Results for multiple parameter case:
test replications elapsed relative
2 glmn 300 0.70 1.000
1 cpp 300 5.24 7.486
Results for lassoshooting loop and cpp:
test replications elapsed relative
2 lassoshooting_loop 300 4.06 1.000
1 cpp 300 6.38 1.571
Package {glmnet} uses warm starts and special rules for discarding lots of predictors, which makes fitting the whole "regularization path" very fast.
See their paper.
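To illustrate only the warm-start part, here is a minimal sketch in plain C++ using std::vector rather than Armadillo. The objective 0.5 * ||y - X b||^2 + lambda * ||b||_1, the names lasso_cd and lasso_path, and the tolerances are assumptions for illustration; this is not glmnet's implementation and it leaves out the predictor-discarding rules entirely.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // n rows, p columns

// Soft-thresholding: sign(z) * max(|z| - g, 0).
double soft_threshold(double z, double g) {
    if (z > g) return z - g;
    if (z < -g) return z + g;
    return 0.0;
}

// Coordinate descent for the assumed objective. `beta` is the starting
// point on entry and the solution on exit. Returns the number of sweeps.
int lasso_cd(const Matrix& X, const std::vector<double>& y, double lambda,
             std::vector<double>& beta, double tol = 1e-10,
             int max_iter = 10000) {
    const std::size_t n = X.size();
    assert(n > 0 && y.size() == n);
    const std::size_t p = X[0].size();
    assert(beta.size() == p);
    // Residual r = y - X * beta for the supplied starting point.
    std::vector<double> r(y);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < p; ++j) r[i] -= X[i][j] * beta[j];
    int sweeps = 0;
    double max_change = tol + 1.0;
    while (max_change > tol && sweeps < max_iter) {
        max_change = 0.0;
        for (std::size_t j = 0; j < p; ++j) {
            double xx = 0.0, xr = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                xx += X[i][j] * X[i][j];
                xr += X[i][j] * r[i];
            }
            if (xx == 0.0) continue;  // all-zero column: leave beta[j] alone
            const double old = beta[j];
            beta[j] = soft_threshold(xr + xx * old, lambda) / xx;
            const double delta = beta[j] - old;
            if (delta != 0.0)
                for (std::size_t i = 0; i < n; ++i) r[i] -= X[i][j] * delta;
            max_change = std::fmax(max_change, std::fabs(delta));
        }
        ++sweeps;
    }
    return sweeps;
}

// Path over lambdas (ideally sorted from largest to smallest). Each fit
// starts from the previous solution instead of from zero.
std::vector<std::vector<double>> lasso_path(
    const Matrix& X, const std::vector<double>& y,
    const std::vector<double>& lambdas) {
    assert(!X.empty());
    std::vector<double> beta(X[0].size(), 0.0);
    std::vector<std::vector<double>> path;
    for (double lambda : lambdas) {
        lasso_cd(X, y, lambda, beta);  // warm start: beta carries over
        path.push_back(beta);
    }
    return path;
}
```

When neighbouring lambda values are close, each fit starts near its solution, so the sweep count returned by lasso_cd is typically small for all but the first lambda.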

Is there any "standard" way to calculate the numerical gradient?

I am trying to calculate the numerical gradient of a smooth function in C++. The parameter values can vary from zero to a very large number (maybe 1e10 to 1e20?).
I used the function f(x,y) = 10*x^3 + y^3 as a testbench, but I found that if x or y is too large, I can't get the correct gradient.
Here is my code to calculate the gradient:
#include <iostream>
#include <cmath>
#include <cassert>
using namespace std;
double f(double x, double y)
{
// black box expensive function
return 10 * pow(x, 3) + pow(y, 3);
}
int main()
{
// double x = -5897182590.8347721;
// double y = 269857217.0017581;
double x = 1.13041e+19;
double y = -5.49756e+14;
const double epsi = 1e-4;
double f1 = f(x, y);
double f2 = f(x, y+epsi);
double f3 = f(x, y-epsi);
cout << f1 << endl;
cout << f2 << endl;
cout << f3 << endl;
cout << f1 - f2 << endl; // 0
cout << f2 - f3 << endl; // 0
return 0;
}
If I use the above code to calculate the gradient, the gradient would be zero!
The testbench function, 10*x^3 + y^3, is just a demo, the real problem I need to solve is actually a black box function.
So, is there any "standard" way to calculate the numerical gradient?
In the first place, you should use the central difference scheme, which is more accurate (by cancellation of one more term of the Taylor expansion).
(f(x + h) - f(x - h)) / (2h)
rather than
(f(x + h) - f(x)) / h
Then the choice of h is critical, and using a fixed constant is the worst thing you can do: for small x, h will be too large for the approximation formula to hold, and for large x, h will be too small, so that x + h rounds to (nearly) x and the difference is dominated by round-off error.
A much better choice is to take a relative value, h = x√ε, where ε is the machine epsilon (1 ulp), which gives a good tradeoff.
(f(x(1 + √ε)) - f(x(1 - √ε))) / (2x√ε)
Beware that when x = 0, a relative value cannot work and you need to fall back to a constant. But then, nothing tells you which constant to use!
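The relative-step central difference can be sketched in C++ as follows. The name central_diff and the fallback step of 1e-8 for x = 0 are assumptions; as noted above, nothing in the method itself dictates that constant.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Central difference with a step relative to |x|.
// `fallback_h` (used when x == 0) is an assumed value; the right choice
// depends on the scale of the problem.
template <typename F>
double central_diff(F f, double x, double fallback_h = 1e-8) {
    assert(fallback_h > 0.0);
    const double eps = std::numeric_limits<double>::epsilon();
    const double h = (x != 0.0) ? std::fabs(x) * std::sqrt(eps) : fallback_h;
    // Use the points actually representable in floating point, so the
    // divisor matches the real distance between them.
    const double xp = x + h;
    const double xm = x - h;
    return (f(xp) - f(xm)) / (xp - xm);
}
```

For a function of two variables, the same helper can be applied to one argument at a time by wrapping the function in a lambda that fixes the other argument.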
You need to consider the precision needed.
At first glance, since |y| = 5.49756e14 and epsi = 1e-4, you need at least ⌈log2(5.49756e14)-log2(1e-4)⌉ = 63 bits of significand precision (that is the number of bits used to encode the digits of your number, also known as mantissa) for y and y+epsi to be considered different.
The double-precision floating-point format only has 53 bits of significand precision (assuming it is 8 bytes). So, currently, f1, f2 and f3 are exactly the same because y, y+epsi and y-epsi are equal.
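The bit count and the resulting loss of the step can be checked with a small sketch like the one below. The helper names bits_needed and step_is_lost are made up for illustration, and the check assumes IEEE 754 doubles evaluated in double precision.

```cpp
#include <cassert>
#include <cmath>

// Significand bits needed for y and y + step to be distinct values,
// following the ceil(log2|y| - log2|step|) estimate.
int bits_needed(double y, double step) {
    assert(y != 0.0 && step != 0.0);
    return static_cast<int>(
        std::ceil(std::log2(std::fabs(y)) - std::log2(std::fabs(step))));
}

// True when adding `step` to `y` leaves `y` unchanged in double precision.
bool step_is_lost(double y, double step) {
    const double shifted = y + step;
    return shifted == y;
}
```

Calling step_is_lost(y, epsi) before differencing is a cheap way to detect that a chosen step cannot produce any change in the argument.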
Now, let's consider the limit : y = 1e20, and the result of your function, 10x^3 + y^3. Let's ignore x for now, so let's take f = y^3. Now we can calculate the precision needed for f(y) and f(y+epsi) to be different : f(y) = 1e60 and f(epsi) = 1e-12. This gives a minimum significand precision of ⌈log2(1e60)-log2(1e-12)⌉ = 240 bits.
Even if you were to use the long double type, assuming it is 16 bytes, your results would not differ : f1, f2 and f3 would still be equal, even though y and y+epsi would not.
If we take x into account, the maximum value of f would be 11e60 (with x = y = 1e20). So the upper limit on precision is ⌈log2(11e60)-log2(1e-12)⌉ = 243 bits, or at least 31 bytes.
One way to solve your problem is to use another type, maybe a bignum used as fixed-point.
Another way is to rethink your problem and deal with it differently. Ultimately, what you want is f1 - f2. You can try to decompose f(y+epsi). Again, if you ignore x, f(y+epsi) = (y+epsi)^3 = y^3 + 3*y^2*epsi + 3*y*epsi^2 + epsi^3. So f(y+epsi) - f(y) = 3*y^2*epsi + 3*y*epsi^2 + epsi^3.
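For the y^3 term, that decomposition can be sketched as below. The names cube_increment and cube_slope are illustrative, and the approach only applies when the formula of the function is known, which is not the case for a true black box.

```cpp
#include <cassert>
#include <cmath>

// f(y + e) - f(y) for f(y) = y^3, expanded so that the large y^3 terms
// cancel on paper instead of in floating point.
double cube_increment(double y, double e) {
    return e * (3.0 * y * y + 3.0 * y * e + e * e);
}

// Forward-difference slope of y^3 built from the expanded increment.
double cube_slope(double y, double e) {
    assert(e != 0.0);
    return cube_increment(y, e) / e;
}
```

With y = -5.49756e14 and e = 1e-4, cube_slope stays close to 3*y^2 even though y + e is indistinguishable from y in double precision.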
The only way to calculate gradient is calculus.
Gradient is a vector:
g(x, y) = Df/Dx i + Df/Dy j
where (i, j) are unit vectors in x and y directions, respectively.
One way to approximate derivatives is first order differences:
Df/Dx ~ (f(x2, y)-f(x1, y))/(x2-x1)
and
Df/Dy ~ (f(x, y2)-f(x, y1))/(y2-y1)
That doesn't look like what you're doing.
You have a closed form expression:
g(x, y) = 30*x^2 i + 3*y^2 j
You can plug in values for (x, y) and calculate the gradient exactly at any point. Compare that to your differences and see how well your approximation is doing.
How you implement it numerically is your responsibility. (10^19)^3 = 10^57, right?
What is the size of double on your machine? Is it a 64 bit IEEE double precision floating point number?
Use
dx = (1+abs(x))*eps, dfdx = (f(x+dx,y) - f(x,y)) / dx
dy = (1+abs(y))*eps, dfdy = (f(x,y+dy) - f(x,y)) / dy
to get meaningful step sizes for large arguments.
Use eps = 1e-8 for one-sided difference formulas, eps = 1e-5 for central difference quotients.
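A minimal sketch of these step sizes, applied to a central-difference gradient of a two-argument function. The struct Gradient2 and the function numeric_gradient are hypothetical names; the default eps = 1e-5 follows the suggestion for central difference quotients.

```cpp
#include <cassert>
#include <cmath>

struct Gradient2 {
    double dfdx;
    double dfdy;
};

// Central-difference gradient with steps scaled as (1 + |x|) * eps.
template <typename F>
Gradient2 numeric_gradient(F f, double x, double y, double eps = 1e-5) {
    assert(eps > 0.0);
    const double dx = (1.0 + std::fabs(x)) * eps;
    const double dy = (1.0 + std::fabs(y)) * eps;
    Gradient2 g;
    g.dfdx = (f(x + dx, y) - f(x - dx, y)) / (2.0 * dx);
    g.dfdy = (f(x, y + dy) - f(x, y - dy)) / (2.0 * dy);
    return g;
}
```

The 1 + |x| factor keeps the step from collapsing to zero near x = 0 while still growing with large arguments.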
Explore automatic differentiation (see autodiff.org) for derivatives without difference quotients and thus much smaller numerical errors.
We can examine the behaviour of the error in the derivative using the following program - it calculates the 1-sided derivative and the central difference based derivative using a varying step size. Here I'm using x and y ~ 10^10, which is smaller than what you were using, but should illustrate the same point.
#include <iostream>
#include <cmath>
#include <cassert>
using namespace std;
double f(double x, double y) {
return 10 * pow(x, 3) + pow(y, 3);
}
double f_x(double x, double y) {
return 3 * 10 * pow(x,2);
}
double f_y(double x, double y) {
return 3 * pow(y,2);
}
int main()
{
// double x = -5897182590.8347721;
// double y = 269857217.0017581;
double x = 1.13041e+10;
double y = -5.49756e+10;
//double x = 10.1;
//double y = -5.2;
double epsi = 1e8;
for(int i=0; i<60; ++i) {
double dfx_n = (f(x+epsi,y) - f(x,y))/epsi;
double dfx_cd = (f(x+epsi,y) - f(x-epsi,y))/(2*epsi);
double dfx = f_x(x,y);
cout<<epsi<<" "<<fabs(dfx-dfx_n)<<" "<<fabs(dfx - dfx_cd)<<std::endl;
epsi/=1.5;
}
return 0;
}
The output shows that a 1-sided difference gets us an optimal error of about 1.37034e+13 at a step length of about 100.0. Note that while this error looks large, as a relative error it is 3.5746632302764072e-09 (since the exact value is 3.833e+21)
In comparison, the 2-sided difference gets an optimal error of about 1.89493e+10 with a step size of about 45109.3. This is three orders of magnitude better (with a much larger step size).
How can we work out the step size? The link in the comments of Yves Daoust's answer gives us a ballpark value:
h=x_c sqrt(eps) for 1-Sided, and h=x_c cbrt(eps) for 2-Sided.
But either way, if the required step size for decent accuracy at x ~ 10^10 is 100.0, the required step size with x ~ 10^20 is going to be 10^10 larger too. So the problem is simply that your step size is way too small.
This can be verified by increasing the starting step-size in the above code and resetting the x/y values to the original values.
Then the expected derivative is O(1e39); the best 1-sided error of about O(1e31) occurs near a step length of 5.9e10, and the best 2-sided error of about O(1e29) occurs near a step length of 6.1e13.
As numerical differentiation is ill-conditioned (which means a small error could alter your result significantly), you should consider using Cauchy's integral formula. This way you can calculate the n-th derivative with an integral, which leads to fewer problems with accuracy and stability.
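As a rough sketch of that idea for the first derivative, the contour integral over a circle around the point can be approximated with the trapezoidal rule. This assumes the function is analytic on and inside the circle and can be evaluated at complex arguments, which a black-box function may not allow; the name cauchy_derivative and the defaults for the radius and the number of points are assumptions, and the radius should be scaled to the magnitude of the argument.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// First derivative at `a` via Cauchy's integral formula, with the contour
// integral over a circle of radius r approximated by an n-point
// trapezoidal rule:
//   f'(a) ~ 1/(n r) * sum_k f(a + r e^{i t_k}) e^{-i t_k},  t_k = 2 pi k / n
// Assumes f is analytic on and inside the circle.
template <typename F>
double cauchy_derivative(F f, double a, double r = 1.0, int n = 32) {
    assert(r > 0.0 && n > 0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::complex<double> sum(0.0, 0.0);
    for (int k = 0; k < n; ++k) {
        const double t = two_pi * k / n;
        const std::complex<double> w = std::polar(1.0, t);  // e^{i t}
        sum += f(std::complex<double>(a, 0.0) + r * w) * std::conj(w);
    }
    return (sum / (static_cast<double>(n) * r)).real();
}
```

No subtraction of nearly equal function values takes place here, which is where the difference quotients lose their accuracy.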

Numerical precision for difference of squares

In my code I often compute things like the following piece (here C code for simplicity):
float cos_theta = /* some simple operations; no cosf call! */;
float sin_theta = sqrtf(1.0f - cos_theta * cos_theta); // Option 1
For this example, ignore that the argument of the square root might be negative due to imprecision. I fixed that with an additional fdimf call. However, I wondered if the following is more precise:
float sin_theta = sqrtf((1.0f + cos_theta) * (1.0f - cos_theta)); // Option 2
cos_theta is between -1 and +1, so for each choice there will be situations where I subtract similar numbers and thus lose precision, right? Which is the most precise, and why?
The most precise way with floats is likely to compute both sin and cos using a single x87 instruction, fsincos.
However, if you need to do the computation manually, it's best to group arguments with similar magnitudes. This means the second option is more precise, especially when cos_theta is close to 0, where precision matters the most.
As the article
What Every Computer Scientist Should Know About Floating-Point Arithmetic notes:
The expression x^2 - y^2 is another formula that exhibits catastrophic
cancellation. It is more accurate to evaluate it as (x - y)(x + y).
Edit: it's more complicated than this. Although the above is generally true, (x - y)(x + y) is slightly less accurate when x and y are of very different magnitudes, as the footnote to the statement explains:
In this case, (x - y)(x + y) has three rounding errors, but x^2 - y^2 has only two since the rounding error committed when computing the smaller of x^2 and y^2 does not affect the final subtraction.
In other words, taking x - y, x + y, and the product (x - y)(x + y) each introduce rounding errors (3 steps of rounding error). x^2, y^2, and the subtraction x^2 - y^2 also each introduce rounding errors, but the rounding error obtained by squaring a relatively small number (the smaller of x and y) is so negligible that there are effectively only two steps of rounding error, making the difference of squares more precise.
So option 1 is actually going to be more precise. This is confirmed by dev.brutus's Java test.
I wrote a small test. It calculates the expected value with double precision. Then it calculates the error of each of your options. The first option is better:
Algorithm: FloatTest$1
option 1 error = 3.802792362162126
option 2 error = 4.333273185303996
Algorithm: FloatTest$2
option 1 error = 3.802792362167937
option 2 error = 4.333273185305868
The Java code:
import org.junit.Test;
public class FloatTest {
@Test
public void test() {
testImpl(new ExpectedAlgorithm() {
public double te(double cos_theta) {
return Math.sqrt(1.0f - cos_theta * cos_theta);
}
});
testImpl(new ExpectedAlgorithm() {
public double te(double cos_theta) {
return Math.sqrt((1.0f + cos_theta) * (1.0f - cos_theta));
}
});
}
public void testImpl(ExpectedAlgorithm ea) {
double delta1 = 0;
double delta2 = 0;
for (double cos_theta = -1; cos_theta <= 1; cos_theta += 1e-8) {
double[] delta = delta(cos_theta, ea);
delta1 += delta[0];
delta2 += delta[1];
}
System.out.println("Algorithm: " + ea.getClass().getName());
System.out.println("option 1 error = " + delta1);
System.out.println("option 2 error = " + delta2);
}
private double[] delta(double cos_theta, ExpectedAlgorithm ea) {
double expected = ea.te(cos_theta);
double delta1 = Math.abs(expected - t1((float) cos_theta));
double delta2 = Math.abs(expected - t2((float) cos_theta));
return new double[]{delta1, delta2};
}
private double t1(float cos_theta) {
return Math.sqrt(1.0f - cos_theta * cos_theta);
}
private double t2(float cos_theta) {
return Math.sqrt((1.0f + cos_theta) * (1.0f - cos_theta));
}
interface ExpectedAlgorithm {
double te(double cos_theta);
}
}
The correct way to reason about numerical precision of some expression is to:
Measure the result discrepancy relative to the correct value in ULPs (units in the last place), introduced in 1960 by W. H. Kahan. You can find C, Python & Mathematica implementations here, and learn more on the topic here.
Discriminate between two or more expressions based on the worst case they produce, not average absolute error as done in other answers or by some other arbitrary metric. This is how numerical approximation polynomials are constructed (Remez algorithm), how standard library methods' implementations are analysed (e.g. Intel atan2), etc...
With that in mind, version_1: sqrt(1 - x * x) and version_2: sqrt((1 - x) * (1 + x)) produce significantly different outcomes. As presented in the plot below, version_1 demonstrates catastrophic performance for x close to 1 with error > 1_000_000 ulps, while on the other hand error of version_2 is well behaved.
That is why I always recommend using version_2, i.e. exploiting the square difference formula.
Python 3.6 code that produces square_diff_error.csv file:
from fractions import Fraction
from math import exp, fabs, sqrt
from random import random
from struct import pack, unpack
def ulp(x):
"""
Computing ULP of input double precision number x exploiting
lexicographic ordering property of positive IEEE-754 numbers.
The implementation correctly handles the special cases:
- ulp(NaN) = NaN
- ulp(-Inf) = Inf
- ulp(Inf) = Inf
Author: Hrvoje Abraham
Date: 11.12.2015
Revisions: 15.08.2017
26.11.2017
MIT License https://opensource.org/licenses/MIT
:param x: (float) float ULP will be calculated for
:returns: (float) the input float number ULP value
"""
# setting sign bit to 0, e.g. -0.0 becomes 0.0
t = abs(x)
# converting IEEE-754 64-bit format bit content to unsigned integer
ll = unpack('Q', pack('d', t))[0]
# computing first smaller integer, bigger in a case of ll=0 (t=0.0)
near_ll = abs(ll - 1)
# converting back to float, its value will be float nearest to t
near_t = unpack('d', pack('Q', near_ll))[0]
# abs takes care of case t=0.0
return abs(t - near_t)
with open('e:/square_diff_error.csv', 'w') as f:
for _ in range(100_000):
# nonlinear distribution of x in [0, 1] to produce more cases close to 1
k = 10
x = (exp(k) - exp(k * random())) / (exp(k) - 1)
fx = Fraction(x)
correct = sqrt(float(Fraction(1) - fx * fx))
version1 = sqrt(1.0 - x * x)
version2 = sqrt((1.0 - x) * (1.0 + x))
err1 = fabs(version1 - correct) / ulp(correct)
err2 = fabs(version2 - correct) / ulp(correct)
f.write(f'{x},{err1},{err2}\n')
Mathematica code that produces the final plot:
data = Import["e:/square_diff_error.csv"];
err1 = {1 - #[[1]], #[[2]]} & /@ data;
err2 = {1 - #[[1]], #[[3]]} & /@ data;
ListLogLogPlot[{err1, err2}, PlotRange -> All, Axes -> False, Frame -> True,
FrameLabel -> {"1-x", "error [ULPs]"}, LabelStyle -> {FontSize -> 20}]
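The same worst-case, ULP-based comparison can be sketched in C++ for the float case from the question. Here a double-precision evaluation is assumed to be accurate enough to serve as the reference, the sampling of cos_theta values is an arbitrary choice, and all function names are made up for illustration. What version 1 reports can depend on compiler settings (for example whether the multiply and subtract are fused), so no expected numbers are given.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

float sin_v1(float c) { return std::sqrt(1.0f - c * c); }
float sin_v2(float c) { return std::sqrt((1.0f - c) * (1.0f + c)); }

// Error of a float result in units of float ULPs, measured against a
// double-precision reference (assumed accurate enough here).
double ulp_error(float approx, double reference) {
    const float ref_f = static_cast<float>(reference);
    const float next =
        std::nextafter(ref_f, std::numeric_limits<float>::infinity());
    const double ulp = static_cast<double>(next) - static_cast<double>(ref_f);
    return std::fabs(static_cast<double>(approx) - reference) / ulp;
}

// Worst error over n samples with 1 - c spread logarithmically
// between roughly 1 and 1e-7 (an assumed sampling scheme).
double worst_ulp_error(float (*version)(float), int n) {
    assert(n > 0);
    double worst = 0.0;
    for (int i = 1; i <= n; ++i) {
        const float c = static_cast<float>(1.0 - std::pow(10.0, -7.0 * i / n));
        const double d = c;
        const double reference = std::sqrt((1.0 - d) * (1.0 + d));
        worst = std::fmax(worst, ulp_error(version(c), reference));
    }
    return worst;
}
```

Printing worst_ulp_error(sin_v1, n) next to worst_ulp_error(sin_v2, n) gives a single number per version to compare, in the spirit of judging by the worst case.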
As an aside, you will always have a problem when theta is small, because the cosine is flat around theta = 0. If theta is between -0.0001 and 0.0001 then cos(theta) in float is exactly one, so your sin_theta will be exactly zero.
To answer your question, when cos_theta is close to one (corresponding to a small theta), your second computation is clearly more accurate. This is shown by the following program, that lists the absolute and relative errors for both computations for various values of cos_theta. The errors are computed by comparing against a value which is computed with 200 bits of precision, using GNU MP library, and then converted to a float.
#include <math.h>
#include <stdio.h>
#include <gmp.h>
int main()
{
int i;
printf("cos_theta abs (1) rel (1) abs (2) rel (2)\n\n");
for (i = -14; i < 0; ++i) {
float x = 1 - pow(10, i/2.0);
float approx1 = sqrt(1 - x * x);
float approx2 = sqrt((1 - x) * (1 + x));
/* Use GNU MultiPrecision Library to get 'exact' answer */
mpf_t tmp1, tmp2;
mpf_init2(tmp1, 200); /* use 200 bits precision */
mpf_init2(tmp2, 200);
mpf_set_d(tmp1, x);
mpf_mul(tmp2, tmp1, tmp1); /* tmp2 = x * x */
mpf_neg(tmp1, tmp2); /* tmp1 = -x * x */
mpf_add_ui(tmp2, tmp1, 1); /* tmp2 = 1 - x * x */
mpf_sqrt(tmp1, tmp2); /* tmp1 = sqrt(1 - x * x) */
float exact = mpf_get_d(tmp1);
printf("%.8f %.3e %.3e %.3e %.3e\n", x,
fabs(approx1 - exact), fabs((approx1 - exact) / exact),
fabs(approx2 - exact), fabs((approx2 - exact) / exact));
/* printf("%.10f %.8f %.8f %.8f\n", x, exact, approx1, approx2); */
}
return 0;
}
Output:
cos_theta abs (1) rel (1) abs (2) rel (2)
0.99999988 2.910e-11 5.960e-08 0.000e+00 0.000e+00
0.99999970 5.821e-11 7.539e-08 0.000e+00 0.000e+00
0.99999899 3.492e-10 2.453e-07 1.164e-10 8.178e-08
0.99999684 2.095e-09 8.337e-07 0.000e+00 0.000e+00
0.99998999 1.118e-08 2.497e-06 0.000e+00 0.000e+00
0.99996835 6.240e-08 7.843e-06 9.313e-10 1.171e-07
0.99989998 3.530e-07 2.496e-05 0.000e+00 0.000e+00
0.99968380 3.818e-07 1.519e-05 0.000e+00 0.000e+00
0.99900001 1.490e-07 3.333e-06 0.000e+00 0.000e+00
0.99683774 8.941e-08 1.125e-06 7.451e-09 9.376e-08
0.99000001 5.960e-08 4.225e-07 0.000e+00 0.000e+00
0.96837723 1.490e-08 5.973e-08 0.000e+00 0.000e+00
0.89999998 2.980e-08 6.837e-08 0.000e+00 0.000e+00
0.68377221 5.960e-08 8.168e-08 5.960e-08 8.168e-08
When cos_theta is not close to one, then the accuracy of both methods is very close to each other and to round-off error.
[Edited for major think-o] It looks to me like option 2 will be better, because for a number like 0.000001, for example, option 1 will return the sine as 1 while option 2 will return a number just smaller than 1.
No difference, in my opinion, since (1-x) preserves the precision without affecting the carried bit. The same is true for (1+x). Then the only thing affecting the carry-bit precision is the multiplication. In both cases there is a single multiplication, so they are both as likely to give the same carry-bit error.

Better way than if else if else... for linear interpolation

The question is easy.
Let's say you have a function
double interpolate (double x);
and you have a table that maps known x -> y
for example
5 15
7 18
10 22
note: real tables are bigger ofc, this is just example.
so for 8 you would return 18+((8-7)/(10-7))*(22-18)=19.3333333
One cool way I found is
http://www.bnikolic.co.uk/blog/cpp-map-interp.html
(long story short it uses std::map, key= x, value = y for x->y data pairs).
If somebody asks what is the if else if else way in title
it is basically:
if ((x>=5) && (x<=7))
{
//interpolate
}
else
if((x>=7) && x<=10)
{
//interpolate
}
So is there a more clever way to do it or map way is the state of the art? :)
Btw I prefer solutions in C++, but obviously a solution in any language that has a 1:1 mapping to C++ is nice.
Well, the easiest way I can think of would be using a binary search to find the point where your point lies. Try to avoid maps if you can, as they are very slow in practice.
This is a simple way:
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>
using namespace std;
const double INF = 1.e100;
vector<pair<double, double> > table;
double interpolate(double x) {
// Assumes that "table" is sorted by .first
// Check if x is out of bound
if (x > table.back().first) return INF;
if (x < table[0].first) return -INF;
vector<pair<double, double> >::iterator it, it2;
// INFINITY is defined in math.h in the glibc implementation
it = lower_bound(table.begin(), table.end(), make_pair(x, -INF));
// Corner case
if (it == table.begin()) return it->second;
it2 = it;
--it2;
return it2->second + (it->second - it2->second)*(x - it2->first)/(it->first - it2->first);
}
int main() {
table.push_back(make_pair(5., 15.));
table.push_back(make_pair(7., 18.));
table.push_back(make_pair(10., 22.));
// If you are not sure if table is sorted:
sort(table.begin(), table.end());
printf("%f\n", interpolate(8.));
printf("%f\n", interpolate(10.));
printf("%f\n", interpolate(10.1));
}
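A variant of the same binary-search idea is sketched below. It searches on the x value only and clamps to the end points instead of returning a sentinel, which is a design choice rather than part of the answer above; the names Point and interpolate_sorted are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // (x, y)

// Linear interpolation in a table sorted by x; clamps outside the range.
double interpolate_sorted(const std::vector<Point>& table, double x) {
    assert(!table.empty());
    if (x <= table.front().first) return table.front().second;
    if (x >= table.back().first) return table.back().second;
    // First entry whose x is greater than the query (binary search).
    auto hi = std::upper_bound(
        table.begin(), table.end(), x,
        [](double value, const Point& p) { return value < p.first; });
    auto lo = hi - 1;
    const double t = (x - lo->first) / (hi->first - lo->first);
    return lo->second + t * (hi->second - lo->second);
}
```

With the table from the question, a query at 8 falls between the entries at 7 and 10 and is interpolated from those two.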
You can use a binary search tree to store the interpolation data. This is beneficial when you have a large set of N interpolation points, as interpolation can then be performed in O(log N) time. However, in your example, this does not seem to be the case, and the linear search suggested by RedX is more appropriate.
#include <stdio.h>
#include <assert.h>
#include <map>
static double interpolate (double x, const std::map<double, double> &table)
{
assert(table.size() > 0);
std::map<double, double>::const_iterator it = table.lower_bound(x);
if (it == table.end()) {
return table.rbegin()->second;
} else {
if (it == table.begin()) {
return it->second;
} else {
double x2 = it->first;
double y2 = it->second;
--it;
double x1 = it->first;
double y1 = it->second;
double p = (x - x1) / (x2 - x1);
return (1 - p) * y1 + p * y2;
}
}
}
int main ()
{
std::map<double, double> table;
table.insert(std::pair<double, double>(5, 6));
table.insert(std::pair<double, double>(8, 4));
table.insert(std::pair<double, double>(9, 5));
double y = interpolate(5.1, table);
printf("%f\n", y);
}
Store your points sorted:
index X Y
1 1 -> 3
2 3 -> 7
3 10-> 8
Then loop from max to min, and as soon as you get to an x at or below your number, you know it's the one you want.
You want let's say 6 so:
// pseudo
for i = 3 to 1
if x[i] <= 6
// you found your range!
// interpolate between x[i] and x[i + 1]
break; // Do not look any further
end
end
Yes, I guess that you should think of a map between those intervals and the natural numbers. I mean, just label the intervals and use a switch:
switch(I) {
case Int1: //whatever
break;
...
default:
}
I don't know, it's the first thing that I thought of.
EDIT Switch is more efficient than if-else if your numbers are within a relatively small interval (that's something to take into account when doing the mapping)
If your x-coordinates must be irregularly spaced, then store the x-coordinates in sorted order, and use a binary search to find the nearest coordinate, for example using Daniel Fleischman's answer.
However, if your problem permits it, consider pre-interpolating to regularly spaced data. So
5 15
7 18
10 22
becomes
5 15
6 16.5
7 18
8 19.3333333
9 20.6666667
10 22
Then at run-time you can interpolate with O(1) using something like this:
double interp1( double x0, double dx, double* y, int n, double xi )
{
double f = ( xi - x0 ) / dx;
if (f<0) return y[0];
if (f>=(n-1)) return y[n-1];
int i = (int) f;
double w = f-(double)i;
return y[i]*(1.0-w) + y[i+1]*w;
}
using
double y[6] = {15,16.5,18,19.3333333, 20.6666667, 22 };
double yi = interp1( 5.0 , 1.0 , y, 6, xi );
This isn't necessarily suitable for every problem -- you could end up losing accuracy (if there's no nice grid that contains all your x-samples), and it could have a bad cache penalty if it would make your table much much bigger. But it's a good option for cases where you have some control over the x-coordinates to begin with.
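The pre-interpolation step itself can be sketched as a one-off resampling pass. The name resample_to_grid is hypothetical, and the sketch assumes strictly increasing x values and clamps grid points that fall outside the table.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Resample a table (xs strictly increasing, same length as ys) onto the
// regular grid x0, x0 + dx, ..., x0 + (n - 1) * dx.
// Values outside [xs.front(), xs.back()] are clamped.
std::vector<double> resample_to_grid(const std::vector<double>& xs,
                                     const std::vector<double>& ys,
                                     double x0, double dx, std::size_t n) {
    assert(xs.size() == ys.size() && xs.size() >= 2 && dx > 0.0);
    std::vector<double> out(n);
    std::size_t seg = 0;  // current segment [xs[seg], xs[seg + 1]]
    for (std::size_t k = 0; k < n; ++k) {
        const double x = x0 + static_cast<double>(k) * dx;
        if (x <= xs.front()) { out[k] = ys.front(); continue; }
        if (x >= xs.back()) { out[k] = ys.back(); continue; }
        // Grid points increase, so the segment index only moves forward.
        while (x > xs[seg + 1]) ++seg;
        const double t = (x - xs[seg]) / (xs[seg + 1] - xs[seg]);
        out[k] = ys[seg] + t * (ys[seg + 1] - ys[seg]);
    }
    return out;
}
```

Resampling the three-point table from the question with x0 = 5, dx = 1 and n = 6 reproduces the six-row table shown above, which can then be fed to the O(1) lookup.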
How you've already got it is fairly readable and understandable, and there's a lot to be said for that over a "clever" solution. You can however do away with the lower bounds check and clumsy && because the sequence is ordered:
if (x < 5)
return 0;
else if (x <= 7)
// interpolate
else if (x <= 10)
// interpolate
...

How can I make MATLAB's precision the same as in C++?

I have a problem with precision. I have to make my C++ code have the same precision as MATLAB. In MATLAB I have a script that does some processing on numbers, and I have C++ code that does the same as that script. The output on the same input is different :( I found that in my script, 104 >= 104 returns false. I tried format long, but it did not help me find out why it is false. Both numbers are of type double. I thought that maybe MATLAB stores the real value of 104 somewhere and that it is really something like 103.9999..., so I increased the precision in my C++ code. That did not help either, because where MATLAB returns a value of 50.000, in C++ I get 50.050 with high precision. Those two values come from a few calculations such as + or *. Is there any way to make my C++ code and MATLAB script have the same precision?
for i = 1:neighbors
y = spoints(i,1)+origy;
x = spoints(i,2)+origx;
% Calculate floors, ceils and rounds for the x and y.
fy = floor(y); cy = ceil(y); ry = round(y);
fx = floor(x); cx = ceil(x); rx = round(x);
% Check if interpolation is needed.
if (abs(x - rx) < 1e-6) && (abs(y - ry) < 1e-6)
% Interpolation is not needed, use original datatypes
N = image(ry:ry+dy,rx:rx+dx);
D = N >= C;
else
% Interpolation needed, use double type images
ty = y - fy;
tx = x - fx;
% Calculate the interpolation weights.
w1 = (1 - tx) * (1 - ty);
w2 = tx * (1 - ty);
w3 = (1 - tx) * ty ;
w4 = tx * ty ;
%Compute interpolated pixel values
N = w1*d_image(fy:fy+dy,fx:fx+dx) + w2*d_image(fy:fy+dy,cx:cx+dx) + ...
w3*d_image(cy:cy+dy,fx:fx+dx) + w4*d_image(cy:cy+dy,cx:cx+dx);
D = N >= d_C;
end
The problem is in the else branch, which starts at line 12. tx and ty equal 0.707106781186547 or 1 - 0.707106781186547. Values from d_image are in the range 0 to 255. N is a value in 0..255 obtained by interpolating 4 pixels from the image. d_C is a value in 0..255. I still don't know why MATLAB behaves this way: when N holds values like x x x 140.0000 140.0000 and d_C holds x x x 140 x, D gives me 0 at the 4th position, so 140.0000 != 140. I debugged it with more precision, but it still shows 140.00000000000000 and it is still not equal to 140.
int Codes::Interpolation( Point_<int> point, Point_<int> center , Mat *mat)
{
int x = center.x-point.x;
int y = center.y-point.y;
Point_<double> my;
if(x<0)
{
if(y<0)
{
my.x=center.x+LEN;
my.y=center.y+LEN;
}
else
{
my.x=center.x+LEN;
my.y=center.y-LEN;
}
}
else
{
if(y<0)
{
my.x=center.x-LEN;
my.y=center.y+LEN;
}
else
{
my.x=center.x-LEN;
my.y=center.y-LEN;
}
}
int a=my.x;
int b=my.y;
double tx = my.x - a;
double ty = my.y - b;
double wage[4];
wage[0] = (1 - tx) * (1 - ty);
wage[1] = tx * (1 - ty);
wage[2] = (1 - tx) * ty ;
wage[3] = tx * ty ;
int values[4];
// store the 4 pixels used for the interpolation in the array
for(int i=0;i<4;i++)
{
int val = mat->at<uchar>(Point_<int>(a+help[i].x,a+help[i].y));
values[i]=val;
}
double moze = (wage[0]) * (values[0]) + (wage[1]) * (values[1]) + (wage[2]) * (values[2]) + (wage[3]) * (values[3]);
return moze;
}
LEN = 0.707106781186547. The values in the values array are exactly the same as the MATLAB values.
Matlab uses double precision. You can use C++'s double type. That should make most things similar, but not 100%.
As someone else noted, this is probably not the source of your problem. Either there is a difference in the algorithms, or it might be something like a library function defined differently in Matlab and in C++. For example, Matlab's std() divides by (n-1) and your code may divide by n.
First, as a rule of thumb, it is never a good idea to compare floating-point variables directly. For example, instead of if (nr >= 104) you should use if (nr >= 104-e), where e is a small number, like 0.00001.
However, there must be some serious undersampling or rounding error somewhere in your script, because getting 50050 instead of 50000 is not within the limits of common floating-point imprecision. For example, MATLAB doubles carry about 15 significant digits!
I guess there are some casting problems in your code, for example
int i;
double d;
// ...
d = i/3 * d;
will give a very inaccurate result, because you have an integer division. d = (double)i/3 * d or d = i/3. * d would give a much more accurate result.
The above example would NOT cause any problems in Matlab, because there everything is already a floating-point number by default, so a similar problem might be behind the differences in the results of the c++ and Matlab code.
Seeing your calculations would help a lot in finding what went wrong.
EDIT:
In c and c++, if you compare a double with an integer of the same value, you have a very high chance that they will not be equal. It's the same with two doubles, but you might get lucky if you perform the exact same computations on them. Even in Matlab it's dangerous, and maybe you were just lucky that as both are doubles, both got truncated the same way.
From your recent edit, it seems that the problem is where you evaluate your array. You should never use == or != when comparing floats or doubles in C++ (or in any language when you use floating-point variables). The proper way to do a comparison is to check whether they are within a small distance of each other.
An example: using == or != to compare two doubles is like comparing the weight of two objects by counting the number of atoms in them, and deciding that they are not equal even if there is one single atom difference between them.
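A tolerance-based comparison along these lines can be sketched in C++ as follows. The function names and the default tolerances are assumptions and should be chosen to match the noise level of the actual computation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// True when a and b differ by no more than a mixed absolute/relative
// tolerance. The default tolerances are assumptions; pick them to match
// the noise level of the computation.
bool nearly_equal(double a, double b, double rel_tol = 1e-9,
                  double abs_tol = 1e-12) {
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= std::max(abs_tol, rel_tol * scale);
}

// "a >= b" that also accepts values that fall short of b only by round-off.
bool greater_or_nearly_equal(double a, double b, double rel_tol = 1e-9,
                             double abs_tol = 1e-12) {
    return a > b || nearly_equal(a, b, rel_tol, abs_tol);
}
```

In the situation from the question, an interpolated value that prints as 140.0000 but sits a hair below 140 would pass greater_or_nearly_equal(value, 140.0) while failing a plain >=.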
MATLAB uses double precision unless you say otherwise. Any differences you see with an identical implementation in C++ will be due to floating-point errors.