I need to compute an expression that looks like (A - B * C) / D, where the types are: signed long long int A, B, C, D; Each number can be really big (without overflowing its type). While B * C could overflow, the whole expression (A - B * C) / D would still fit in the type. How can I compute it correctly?
For example: in the equation Ax + By = C, A * x may overflow, but y = (C - A * x) / B could still fit. I'd rather not use a BigInteger or a double data type.
You can transform the equation to do the division first while accounting for the remainders:
Assume / is integer division and everything is infinite precision:
x == x / y * y + x % y
(A - B * C) / D
((A/D * D + (A%D)) - (B/D * D + (B%D)) * (C/D * D + (C%D))) / D
(A/D * D - B/D * D * C/D * D - (B/D * D * (C%D) + (B%D) * C/D * D) + (A%D) - (B%D) * (C%D)) / D
(A/D * D - B/D * D * C/D * D) / D - (B/D * D * (C%D) + (B%D) * C/D * D) / D + ((A%D) - (B%D) * (C%D)) / D
(A/D - B/D * C/D * D) - (B/D * (C%D) + (B%D) * C/D) + ((A%D) - (B%D) * (C%D)) / D
A/D - B/D * C/D * D - B/D * (C%D) - (B%D) * C/D + ((A%D) - (B%D) * (C%D)) / D
Assuming D is neither too small nor too big, x / D and x % D are small and we can do this:
using T = signed long long int;
T compute(T a, T b, T c, T d) {
T a1 = a / d, a2 = a % d;
T b1 = b / d, b2 = b % d;
T c1 = c / d, c2 = c % d;
T m1 = b1 * c1 * d, m2 = b1 * c2, m3 = b2 * c1, m4 = b2 * c2;
T s1 = a1 - m1 - m2 - m3, s2 = a2 - m4;
return s1 + s2 / d; // exact when d divides a - b*c
}
The critical part is the multiplications for m1 through m4. I believe the range of b and c values for which these products overflow, even though the final result would have fit, is rather small.
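As a quick sanity check of the compute above, here is a small usage sketch (my numbers, chosen so that b * c alone would overflow a signed 64-bit integer while the final quotient fits):
#include <cstdio>
int main() {
    // b * c = 1.6e19 overflows signed 64-bit (max is about 9.2e18),
    // but (a - b*c) / d = -1.6e10 fits comfortably, and d divides a - b*c.
    long long a = 0, b = 4000000000LL, c = 4000000000LL, d = 1000000000LL;
    std::printf("%lld\n", compute(a, b, c, d)); // prints -16000000000
}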
I think you could change the order of the operations so it looks like:
A/D - (B/D)*C
In real arithmetic the result remains the same; with integer division it only remains the same when D divides A and B exactly.
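A quick illustration of that divisibility caveat, with made-up numbers:
#include <iostream>
int main() {
    long long A = 10, B = 3, C = 3, D = 2;
    std::cout << (A - B * C) / D << '\n';     // 0, the exact value
    std::cout << A / D - (B / D) * C << '\n'; // 2, because D divides neither A nor B
}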
Since you mentioned gcd, let's try that as an alternative answer:
using T = signed long long int;
T gcd(T a, T b) { return b == 0 ? a : gcd(b, a % b); } // Euclid; assumes non-negative inputs
T compute(T a, T b, T c, T d) {
// use gcd for (b * c) / d
T db = gcd(b, d);
T dc = gcd(c, d / db);
T dm = db * dc;
// use gcd on a for (a - b*c) / dm
T da = gcd(a, dm);
// split da for use on (b * c) / da
db = gcd(b, da);
dc = da / db;
T dr = d / da;
return ((a / da) - (b / db) * (c / dc)) / dr;
}
Only works if d has enough factors in common with a, b and c.
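For example, here is a worked case where d does share enough factors, using the compute above (the numbers are mine):
#include <iostream>
int main() {
    // db = gcd(9, 6) = 3, dc = gcd(10, 6/3) = 2, dm = 6, da = gcd(12, 6) = 6, dr = 1:
    // the whole divisor cancels and (12 - 9*10) / 6 = -13 is computed without b*c.
    std::cout << compute(12, 9, 10, 6) << '\n'; // -13
}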
I have two programs that should be identical but are giving different results: one in Mathematica (giving the correct result) and one in C++ (incorrect).
First the Mathematica:
q = 0.002344;
s = 0.0266;
v = 0.0744;
a = -q*PDCx^2;
b = s*PDCx - 2*q*PCLx*PDCx - PDCz;
c = -1*(PCLz + q*PCLx^2 - s*PCLx + v);
d = b*b - (4*a*c);
t = (-b + Sqrt[d])/(2*a)
Now the C++:
long double q = 0.002344;
long double s = 0.0266;
long double v = 0.0744;
long double a = -q * pow(PDCx, 2);
long double b = s * PDCx - 2 * q*PCLx*PDCx - PDCz;
long double c = (-1.0)*(PCLz + q * pow(PCLx, 2) - s * PCLx + v);
long double d = b * b - 4.0 * a*c;
t = (-b + sqrtf(d))/(2.0*a);
with
long double PCLx = -1.816017;
long double PCLz = 0.056013;
long double PDCx = 0.005073;
long double PDCz = -0.998134;
for each case. The Mathematica result is t = 0.1867646081 and the C++ result is t = 0.124776. This is the "plus" solution of the quadratic. The "minus" solutions are 16549276.47723365 and 16549276.539223, respectively. I suspect that I am allowing the C++ result to be rounded incorrectly.
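One likely source of that rounding: sqrtf is the single-precision square root, so d is pushed through float before the sum -b + sqrt(d) is formed, and any cancellation then amplifies that error; sqrtl (or std::sqrt, which has a long double overload) keeps the full width. A minimal sketch of the precision gap, with a made-up value:
#include <cmath>
#include <cstdio>
int main() {
    long double d = 2.0L; // illustration only
    std::printf("%.20Lf\n", (long double)sqrtf(d)); // ~7 correct digits: 1.41421353...
    std::printf("%.20Lf\n", sqrtl(d));              // 1.41421356237309504880...
}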
My book asks me the following question (this is for CompSci 1):
What is wrong with this version of the quadratic formula?
x1 = (-b - sqrt(b * b - 4 * a * c)) / 2 * a;
x2 = (-b + sqrt(b * b - 4 * a * c)) / 2 * a;
The equation your code is translating is:
x = (-b ± sqrt(b² - 4ac)) / 2 * a
which of course is not the solution for quadratic equations. You want a solution for this equation:
x = (-b ± sqrt(b² - 4ac)) / (2a)
What's the difference? In the first one you compute the numerator, then you divide it by two, then you multiply the result by a. That's what your code is doing. In the second one you compute the numerator, then you compute the denominator, and finally you divide them.
So with additional variables:
num1 = -b - sqrt(b * b - 4 * a * c);
num2 = -b + sqrt(b * b - 4 * a * c);
den = 2 * a;
x1 = num1 / den;
x2 = num2 / den;
which can of course be written as:
x1 = (-b - sqrt(b * b - 4 * a * c)) / (2 * a);
x2 = (-b + sqrt(b * b - 4 * a * c)) / (2 * a);
where you have to put in those parentheses in order to force the denominator to be computed before the division, as suggested in the comment by #atru.
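A short numeric check of the two forms (coefficients picked for illustration):
#include <cmath>
#include <iostream>
int main() {
    double a = 2, b = -6, c = 4; // roots are 1 and 2
    double wrong = (-b - std::sqrt(b*b - 4*a*c)) / 2 * a;   // ((6 - 2) / 2) * 2 = 4
    double right = (-b - std::sqrt(b*b - 4*a*c)) / (2 * a); //  (6 - 2) / 4     = 1
    std::cout << wrong << " vs " << right << '\n';          // prints 4 vs 1
}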
So I have a weird problem with a GNU GCC (C/C++) macro defined as follows:
#define PI 3.14159265359
#define DEG_TO_RAD(a) (a * PI / 180.0)
#define ARCSEC_TO_DEG(a) (a / 3600.0)
#define ARCSEC_TO_RAD(a) DEG_TO_RAD( ARCSEC_TO_DEG( a ) )
The macro, as you can tell, is simply converting a value in seconds of arc to radians. However, depending on where the macro is applied, I get a different result:
double xi2 = ARCSEC_TO_RAD( 2306.2181 * c + 0.30188 * c2 + 0.017998 * c3);
double xi = 2306.2181 * c + 0.30188 * c2 + 0.017998 * c3;
printf("c = %.10f; xi = %.10f = %.10f = %.10f; ",
c, xi, ARCSEC_TO_RAD(xi), xi2);
This outputs:
c = 0.1899931554; xi = 438.1766743152 = 0.0021243405 = 7.6476237313;
Where's the silly error...?
Going step by step,
ARCSEC_TO_RAD( 2306.2181 * c + 0.30188 * c2 + 0.017998 * c3);
will expand to
DEG_TO_RAD( ARCSEC_TO_DEG(2306.2181 * c + 0.30188 * c2 + 0.017998 * c3))
DEG_TO_RAD( (2306.2181 * c + 0.30188 * c2 + 0.017998 * c3 / 3600.0))
((2306.2181 * c + 0.30188 * c2 + 0.017998 * c3 / 3600.0) * PI / 180.0)
Now the regular order of operations kicks in here, so the whole sum 2306.2181 * c + 0.30188 * c2 + 0.017998 * c3 will not be divided by 3600.0; only the last term 0.017998 * c3 will. The old-school C solution is to place parentheses around the macro parameters in the macro body.
The modern C and C++ solution is to use functions. Mark the functions inline if you need that to satisfy the ODR, but the compiler will likely decide on its own whether the function should be expanded inline or not.
This question is tagged C++, so here's the C++ solution:
#include <iostream>
constexpr double PI = 3.14159265359;
/* or
#include <cmath>
const double PI = std::acos(-1);
but I'm not certain you can properly constexpr this */
double DEG_TO_RAD(double a)
{
return a * PI / 180.0;
}
double ARCSEC_TO_DEG(double a)
{
return a / 3600.0;
}
double ARCSEC_TO_RAD(double a)
{
return DEG_TO_RAD( ARCSEC_TO_DEG( a ) );
}
int main ()
{
double c = 10;
double c2 = 20;
double c3 = 30;
std::cout << ARCSEC_TO_RAD(2306.2181 * c + 0.30188 * c2 + 0.017998 * c3) << std::endl;
}
In C++11 or more recent, constexpr can be added to make these former macros compile-time constants should it be necessary.
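A standalone sketch of that constexpr variant (same bodies as above):
constexpr double PI = 3.14159265359;
constexpr double DEG_TO_RAD(double a)    { return a * PI / 180.0; }
constexpr double ARCSEC_TO_DEG(double a) { return a / 3600.0; }
constexpr double ARCSEC_TO_RAD(double a) { return DEG_TO_RAD(ARCSEC_TO_DEG(a)); }
// The conversions can now run at compile time:
static_assert(ARCSEC_TO_DEG(3600.0) == 1.0, "3600 arcsec is one degree");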
I strongly recommend you use functions (maybe inline) instead of macros,
but if for some reason you can't, a workaround is to add parentheses around the received arguments:
#define PI 3.14159265359
#define DEG_TO_RAD(a) ((a) * PI / 180.0)
#define ARCSEC_TO_DEG(a) ((a) / 3600.0)
#define ARCSEC_TO_RAD(a) DEG_TO_RAD( ARCSEC_TO_DEG( (a) ) )
// In the last one the () are not strictly necessary, but it's good practice to always add parentheses around macro args
This prevents errors related to operator precedence when the macro is expanded.
I'm reading the Clojure Programming book by O'Reilly.
I came across an example of head retention.
The first example retains a reference to d (I presume), so it doesn't get garbage collected:
(let [[t d] (split-with #(< % 12) (range 1e8))]
[(count d) (count t)])
;= #<OutOfMemoryError java.lang.OutOfMemoryError: Java heap space>
The second example doesn't retain it, so it completes with no problem:
(let [[t d] (split-with #(< % 12) (range 1e8))]
[(count t) (count d)])
;= [12 99999988]
What I don't get here is what exactly is retained in which case and why.
If I try to return just [(count d)], like this:
(let [[t d] (split-with #(< % 12) (range 1e8))]
[(count d)])
it seems to create the same memory problem.
Further, I recall reading that count in every case realizes/evaluates a sequence.
So, I need that clarified.
If I try to return (count t) first, how is that faster/more memory-efficient than not returning it at all?
And what gets retained in which case, and why?
In both the first and the final examples the original sequence passed to split-with is retained while being realized in full in memory; hence the OOME. The way this happens is indirect; what is retained directly is t, while the original sequence is being held onto by t, a lazy seq, in its unrealized state.
The way t causes the original sequence to be held is as follows. Prior to being realized, t is a LazySeq object storing a thunk which may be called upon at some point to realize t; this thunk needs to store a pointer to the original sequence argument to split-with before it is realized, to pass it on to take-while -- see the implementation of split-with. Once t is realized, the thunk becomes eligible for GC (the field which holds it in the LazySeq object is set to null) and t no longer holds the head of the huge input seq.
The input seq itself is being realized in full by (count d), which needs to realize d, and thus the original input seq.
Moving on to why t is being retained:
In the first case, this is because (count d) gets evaluated before (count t). Since Clojure evaluates these expressions left to right, the local t needs to hang around for the second call to count, and since it happens to hold on to a huge seq (as explained above), that leads to the OOME.
The final example where only (count d) is returned should ideally not hold on to t; the reason that is not the case is somewhat subtle and best explained by referring to the second example.
The second example happens to work fine, because after (count t) is evaluated, t is no longer needed. The Clojure compiler notices this and uses a clever trick to have the local reset to nil simultaneously with the count call being made. The crucial piece of Java code does something like f(t, t=null), so that the current value of t is passed to the appropriate function, but the local is cleared before control is handed over to f, since this happens as a side effect of the expression t=null which is an argument to f; clearly here Java's left-to-right semantics are key to this working.
Back to the final example, this doesn't work, because t is not actually used anywhere and unused locals are not handled by the locals clearing process. (The clearing happens at the point of last use; in the absence of such a point in the program, there is no clearing.)
As for count realizing lazy sequences: it must do that, as there is no general way of predicting the length of a lazy seq without realizing it.
The answer by #Michał Marczyk, while correct, is a little difficult to comprehend. I find this post on Google Groups easier to grasp.
Here's how I understand it:
Step 1: Create a lazy sequence: (range 1e8). Values are not realized yet; I've marked them as asterisks (*):
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ... * * *
Step 2: Create two more lazy sequences, which are "windows" through which you look at the original, huge lazy sequence. The first window contains only 12 elements (t), the other the rest of the elements (d):
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
Step 3 (out-of-memory scenario): you evaluate [(count d) (count t)]. So first you count the elements in d, then in t. What happens is that you go through all the values starting at the first element of d and realize them (marked as !):
* * * * * * * * * * * * * ! * * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
start here and move right ->
* * * * * * * * * * * * * ! ! * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
* * * * * * * * * * * * * ! ! ! * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
...
; this is theoretical end of counting process which will never happen
; because of OutOfMemoryError
* * * * * * * * * * * * * ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ... ! ! !
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
The problem is that all the realized values (!) are retained, because the head of the collection (the first 12 elements) is still needed - we still have to evaluate (count t). This consumes a lot of memory, causing the JVM to throw the OutOfMemoryError.
Step 3 (valid scenario): this time you evaluate [(count t) (count d)]. So first we count the elements of the smaller, head sequence:
! * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
start here and move right ->
! * * * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
Then we count the elements in the d sequence. The compiler knows that the elements from t aren't needed anymore, so they can be garbage collected, freeing up memory:
! * * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
! * * * * * * * * * * * * * * * ... * * *
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
...
... !
t t t t t t t t t t t t t d d d d d d d d d d d d d d d d d ... d d d
^
Now we can see that, because the elements from t weren't needed anymore, memory could be reclaimed as the count moved through the large sequence.
An important addition to the final example:
(let [[t d] (split-with #(< % 12) (range 1e8))]
[(count d)])
Back to the final example, this doesn't work, because t is not actually used anywhere and unused locals are not handled by the locals clearing process.
That's no longer the case: since Clojure 1.9, unused destructured locals are cleared. See CLJ-1744 for more details.
I am trying to convert Lab values to their corresponding RGB values. I don't want to convert a Lab image to an RGB image, just some values of L, a and b. The function cvCvtColor only works on images. Can anybody tell me how to do this?
Thanks;
Code:
CvMat* rgb = cvCreateMat(centres->rows,centres->cols,centres->type);
cvCvtColor(centres,rgb,CV_Lab2BGR);
I don't know how to do it in OpenCV, but if something else is alright, I've implemented it in C. See the functions color_Lab_to_LinearRGB and color_LinearRGB_to_RGB.
Here's the code:
double L, a, b;
double X, Y, Z;
double R, G, B;
// Lab -> normalized XYZ (X,Y,Z are all in 0...1)
Y = L * (1.0/116.0) + 16.0/116.0;
X = a * (1.0/500.0) + Y;
Z = b * (-1.0/200.0) + Y;
X = X > 6.0/29.0 ? X * X * X : X * (108.0/841.0) - 432.0/24389.0;
Y = L > 8.0 ? Y * Y * Y : L * (27.0/24389.0);
Z = Z > 6.0/29.0 ? Z * Z * Z : Z * (108.0/841.0) - 432.0/24389.0;
// normalized XYZ -> linear sRGB (in 0...1)
R = X * (1219569.0/395920.0) + Y * (-608687.0/395920.0) + Z * (-107481.0/197960.0);
G = X * (-80960619.0/87888100.0) + Y * (82435961.0/43944050.0) + Z * (3976797.0/87888100.0);
B = X * (93813.0/1774030.0) + Y * (-180961.0/887015.0) + Z * (107481.0/93370.0);
// linear sRGB -> gamma-compressed sRGB (in 0...1)
R = R > 0.0031308 ? pow(R, 1.0 / 2.4) * 1.055 - 0.055 : R * 12.92;
G = G > 0.0031308 ? pow(G, 1.0 / 2.4) * 1.055 - 0.055 : G * 12.92;
B = B > 0.0031308 ? pow(B, 1.0 / 2.4) * 1.055 - 0.055 : B * 12.92;
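// Note: out-of-gamut Lab inputs can leave R, G, B outside 0...1; clamp before scaling to, e.g., 0...255.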
I think the only way to do what you want is to either:
1. look at the OpenCV source code or documentation for cvCvtColor and implement the equations yourself (they're all explicitly given on that page), or
2. create a dummy image with the few Lab values you want to convert (say 1 by n by 3, where n is the number of values you want to convert) and call cvCvtColor on that.
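A sketch of option 2, assuming the old C API used in the question (untested; for floating-point input OpenCV expects L in 0..100 and a, b in roughly -127..127, and produces BGR components in 0..1):
#include <opencv2/core/core_c.h>
#include <opencv2/imgproc/imgproc_c.h>
// Wrap n Lab triples in a 1 x n, 3-channel float matrix, convert, unpack.
// lab_to_bgr is my name for this helper, not an OpenCV function.
void lab_to_bgr(const float (*lab)[3], float (*bgr)[3], int n)
{
    CvMat *src = cvCreateMat(1, n, CV_32FC3);
    CvMat *dst = cvCreateMat(1, n, CV_32FC3);
    for (int i = 0; i < n; i++)
        cvSet2D(src, 0, i, cvScalar(lab[i][0], lab[i][1], lab[i][2], 0));
    cvCvtColor(src, dst, CV_Lab2BGR);
    for (int i = 0; i < n; i++) {
        CvScalar p = cvGet2D(dst, 0, i);
        bgr[i][0] = (float)p.val[0]; // B
        bgr[i][1] = (float)p.val[1]; // G
        bgr[i][2] = (float)p.val[2]; // R
    }
    cvReleaseMat(&src);
    cvReleaseMat(&dst);
}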