The following C++ template detects overflows from multiplying two unsigned integers.
template<typename UInt> UInt safe_multiply(UInt a, UInt b) {
    UInt x = a * b; // x := ab mod n, for n := 2^#bits > 0
    if (a != 0 && x / a != b)
        cerr << "Overflow for " << a << " * " << b << "." << endl;
    return x;
}
Can you give a proof that this algorithm detects every potential overflow, regardless of how many bits UInt uses?
The case a = 0 cannot result in overflows, so we can consider a >= 1.
It seems that the correctness proof boils down to leading x / a = b together with a*b >= n to a contradiction, since x / a actually means floor(x / a).
When assuming x / a = b exactly (without flooring), this leads to the straightforward consequence a*b = x = a*b - k*n with k >= 1, thus k*n = 0, which contradicts n > 0.
So it remains to show that floor(x / a) = b implies x / a = b, or there must be another way.
WolframAlpha fails to confirm whether that last equation is true (also with the exponents written out).
However, it asserts that the original assumptions have no integer solutions, so the algorithm seems to be correct indeed.
But it doesn't provide an explanation. So why is it correct?
I am looking for the smallest possible explanation that is still mathematically profound, ideally one that fits in a single-line comment. Maybe I am missing something trivial, or the problem is not as easy as it looks.
On a side note, I used Codecogs Equation Editor for the LaTeX markup images, which apparently looks bad in dark mode, so consider switching to light mode or, if you know, please tell me how to use different images depending on the client settings. It is just \bg{white} vs. \bg{black} as part of the image URLs.
To be clear, I'll use the multiplication and division symbols (*, /) mathematically.
Also, for convenience let's name the set N = {0, 1, ..., n - 1}.
Let's clear up what unsigned multiplication is:
Unsigned multiplication for some magnitude n is a modulo-n operation on unsigned-n inputs (inputs that are in N) that results in an unsigned-n output (i.e. also in N).
In other words, the result of unsigned multiplication, x, is x = a*b (mod n), and, additionally, we know that x,a,b are in N.
It's important to be able to expand many modular formulas like so: x = a*b - k*n, where k is an integer - but in our case x,a,b are in N so this implies that k is in N.
Now, let's mathematically say what we're trying to prove:
Given positive integers, a,n, and non-negative integers x,b, where x,a,b are in N, and x = a*b (mod n), then a*b >= n (overflow) implies floor(x/a) != b.
Proof:
If overflow (a*b >= n) then x >= n - k*n = (1 - k)*n (for k in N),
As (1 - k)*n <= x < n, we get (1 - k)*n < n, so k > 0.
This means x <= a*b - n.
So, floor(x/a) <= floor([a*b - n]/a) = floor(b - n/a) = b - ceil(n/a) <= b - 1 (since b is an integer and n/a > 0),
Abbreviated: floor(x/a) <= b - 1
Therefore floor(x/a) != b
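Not a proof, but a quick empirical cross-check of the claim is to exhaustively test a small word size. A sketch of my own (assuming an 8-bit unsigned type stands in for UInt):

#include <cstdint>
#include <cstdio>

int main() {
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b) {
            std::uint8_t x = static_cast<std::uint8_t>(a * b); // x = a*b mod 256
            bool overflowed = a * b >= 256;
            bool detected = (a != 0 && x / a != b);            // the check from safe_multiply
            if (overflowed != detected)
                std::printf("mismatch at %u * %u\n", a, b);
        }
    std::puts("no mismatches means the check is exact for 8 bits");
}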
The multiplication gives either the mathematically correct result, or it is off by some multiple of 2^64. Since you check for a=0, the division always gives the correct result for its input. But in the case of overflow, the input is off by 2^64 or more, so the test will fail as you hoped.
The last bit is that unsigned operations don’t have undefined behaviour except for division by zero, so your code is fine.
I came across this modulo multiplication function in code for the Miller-Rabin primality test. It is supposed to eliminate the integer overflow that occurs when calculating (a * b) % m.
I need some help in understanding what is going on here. Why does this work? and what is the significance of that number literal 0x8000000000000000ULL?
unsigned long long mul_mod(unsigned long long a, unsigned long long b, unsigned long long m) {
    unsigned long long d = 0, mp2 = m >> 1;
    if (a >= m) a %= m;
    if (b >= m) b %= m;
    for (int i = 0; i < 64; i++)
    {
        d = (d > mp2) ? (d << 1) - m : d << 1;
        if (a & 0x8000000000000000ULL)
            d += b;
        if (d >= m) d -= m;
        a <<= 1;
    }
    return d;
}
This code, which currently appears on the modular arithmetic Wikipedia page, only works for arguments of up to 63 bits -- see bottom.
Overview
One way to compute an ordinary multiplication a * b is to add left-shifted copies of b -- one for each 1-bit in a. This is similar to how most of us did long multiplication in school, but simplified: Since we only ever need to "multiply" each copy of b by 1 or 0, all we need to do is either add the shifted copy of b (when the corresponding bit of a is 1) or do nothing (when it's 0).
This code does something similar. However, to avoid overflow (mostly; see below), instead of shifting each copy of b and then adding it to the total, it adds an unshifted copy of b to the total, and relies on later left-shifts performed on the total to shift it into the correct place. You can think of these shifts "acting on" all the summands added to the total so far. For example, the first loop iteration checks whether the highest bit of a, namely bit 63, is 1 (that's what a & 0x8000000000000000ULL does), and if so adds an unshifted copy of b to the total; by the time the loop completes, the previous line of code will have shifted the total d left 1 bit 63 more times.
The main advantage of doing it this way is that we are always adding two numbers (namely b and d) that we already know are less than m, so handling the modulo wraparound is cheap: We know that b + d < 2 * m, so to ensure that our total so far remains less than m, it suffices to check whether b + d < m, and if not, subtract m. If we were to use the shift-then-add approach instead, we would need a % modulo operation per bit, which is as expensive as division -- and usually much more expensive than subtraction.
One of the properties of modulo arithmetic is that, whenever we want to perform a sequence of arithmetic operations modulo some number m, performing them all in usual arithmetic and taking the remainder modulo m at the end always yields the same result as taking remainders modulo m for each intermediate result (provided no overflows occur).
Code
Before the first line of the loop body, we have the invariants d < m and b < m.
The line
d = (d > mp2) ? (d << 1) - m : d << 1;
is a careful way of shifting the total d left by 1 bit, while keeping it in the range 0 .. m and avoiding overflow. Instead of first shifting it and then testing whether the result is m or greater, we test whether it is currently strictly above RoundDown(m/2) -- because if so, after doubling, it will surely be strictly above 2 * RoundDown(m/2) >= m - 1, and so require a subtraction of m to get back in range. Note that even though the (d << 1) in (d << 1) - m may overflow and lose the top bit of d, this does no harm as it does not affect the lowest 64 bits of the subtraction result, which are the only ones we are interested in. (Also note that if d == m/2 exactly, we wind up with d == m afterward, which is slightly out of range -- but changing the test from d > mp2 to d >= mp2 to fix this would break the case where m is odd and d == RoundDown(m/2), so we have to live with this. It doesn't matter, because it will be fixed up below.)
Why not simply write d <<= 1; if (d >= m) d -= m; instead? Suppose that, in infinite-precision arithmetic, d << 1 >= m, so we should perform the subtraction -- but the high bit of d is on and the rest of d << 1 is less than m: In this case, the initial shift will lose the high bit and the if will fail to execute.
Restriction to inputs of 63 bits or fewer
The above edge case can only occur when d's high bit is on, which can only occur when m's high bit is also on (since we maintain the invariant d < m). So it looks like the code is taking pains to work correctly even with very high values of m. Unfortunately, it turns out that it can still overflow elsewhere, resulting in incorrect answers for some inputs that set the top bit. For example, when a = 3, b = 0x7FFFFFFFFFFFFFFFULL and m = 0xFFFFFFFFFFFFFFFFULL, the correct answer should be 0x7FFFFFFFFFFFFFFEULL, but the code will return 0x7FFFFFFFFFFFFFFDULL (an easy way to see the correct answer is to rerun with the values of a and b swapped). Specifically, this behaviour occurs whenever the line d += b overflows and leaves the truncated d less than m, causing a subtraction to be erroneously skipped.
Provided this behaviour is documented (as it is on the Wikipedia page), this is just a limitation, not a bug.
Removing the restriction
If we replace the lines
if (a & 0x8000000000000000ULL)
d += b;
if (d >= m) d -= m;
with
unsigned long long x = -(a >> 63) & b;
if (d >= m - x) d -= m;
d += x;
the code will work for all inputs, including those with top bits set. The cryptic first line is just a conditional-free (and thus usually faster) way of writing
unsigned long long x = (a & 0x8000000000000000ULL) ? b : 0;
The test d >= m - x operates on d before it has been modified -- it's like the old d >= m test, but b (when the top bit of a is on) or 0 (otherwise) has been subtracted from both sides. This tests whether d would be m or larger once x is added to it. We know that the RHS m - x never underflows, because the largest x can be is b and we have established that b < m at the top of the function.
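For reference, here is the whole routine with that substitution applied; this is just the original code with the three lines swapped out as described above:

unsigned long long mul_mod(unsigned long long a, unsigned long long b, unsigned long long m) {
    unsigned long long d = 0, mp2 = m >> 1;
    if (a >= m) a %= m;
    if (b >= m) b %= m;
    for (int i = 0; i < 64; i++)
    {
        d = (d > mp2) ? (d << 1) - m : d << 1;
        unsigned long long x = -(a >> 63) & b;  // b if the top bit of a is set, else 0
        if (d >= m - x) d -= m;                 // pre-subtract so that d + x stays below m
        d += x;
        a <<= 1;
    }
    return d;
}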
Here is a function which, expressed in C, is:
uint32_t f(uint32_t x) {
    return (x * 0x156) ^ 0xfca802c7;
}
Then I came across a challenge: How to find all its fixed points?
I know we can test every uint32_t value to solve this problem, but I still want to know if there is another way that is more elegant - especially when uint32_t becomes uint64_t and (0x156, 0xfca802c7) is an arbitrary pair of values.
Python code:
def f(x, n):
    return ((x*0x156)^0xfca802c7) % n

solns = [1]  # The one solution modulo 2, see text for explanation
n = 1
while n < 2**32:
    prev_n = n
    n = n * 2
    lifted_solns = []
    for soln in solns:
        if f(soln, n) == soln:
            lifted_solns.append(soln)
        if f(soln + prev_n, n) == soln + prev_n:
            lifted_solns.append(soln + prev_n)
    solns = lifted_solns

for soln in solns:
    print(soln, "evaluates to", f(soln, 2**32))
Output: 150129329 evaluates to 150129329
Idea behind the algorithm: We are trying to find x XOR 0xfca802c7 = x*0x156 modulo n, where in our case n=2^32. I wrote it this way because the right side is a simple modular multiplication that behaves nicely with the left side.
The main property we are going to use is that a solution to x XOR 0xfca802c7 = x*0x156 modulo 2^(i+1) reduces to a solution to x XOR 0xfca802c7 = x*0x156 modulo 2^i. Another way of saying that is that a solution to x XOR 0xfca802c7 = x*0x156 modulo 2^i translates to one or two solutions modulo 2^(i+1): those possibilities are either x and/or x+2^i (if we want to be more precise, we are only looking at integers between 0, ..., modulus size - 1 when we say "solution").
We can easily solve this for i=1: x XOR 0xfca802c7 = x*0x156 modulo 2^1 is the same as x XOR 1 = x*0 mod 2, which means x=1 is the only solution. From there we know that only 1 and 3 are the possible solutions modulo 2^2 = 4. So we only have two to try. It turns out that only one works. That's our current solution modulo 4. We can then lift that solution to the possibilities modulo 8. And so on. Eventually we get all such solutions.
Remark1: This code finds all solutions. In this case, there is only one, but for more general parameters there may be more than one.
Remark2: the running time is O(max[number of solutions, modulus size in bits]), assuming I have not made an error. So it is fast unless there are many, many fixed points. In this case, there seems to only be one.
Let's use Z3 solver:
(declare-const x (_ BitVec 32))
(assert (= x (bvxor (bvmul x #x00000156) #xfca802c7)))
(check-sat)
(get-model)
The result is '#x08f2cab1' = 150129329.
Since input bits at position n only affect output bits at positions ≥ n you know that you can find a solution by choosing the first bit, then the second bit, etc.
Here is how you could solve it in C++ for 64-bit integers (of course it also works with 32-bit integers):
#include <cstdint>
#include <cstdio>
uint64_t f(uint64_t x) {
    return (x * 0x7ef93a76ULL) ^ 0x3550e08f8a9c89c7ULL;
}

static void search(uint64_t x, uint64_t bit)
{
    if (bit == 0)
    {
        printf("Fixed point: 0x%llx\n", (long long unsigned)x);
        return;
    }
    if (f(x + bit) & bit) search(x + bit, bit << 1);
    if ((f(x) & bit) == 0) search(x, bit << 1);
}

int main()
{
    search(0x0, 1);
}
With this output:
Fixed point: 0xb9642f1d99863811
I know that if a number is a power of two, then it must satisfy (x & (x-1)) == 0. For example, take x = 16, or 10000 in binary: x - 1 = 01111, and (x & (x-1)) = 0. For a non-power such as 7: 7 = 0111, 7 - 1 = 0110, and 7 & (7-1) = 0110, which is not equal to 0. My question is: how can I determine whether a number is some power of another number k? For example, 625 is 5^4. And how can I find to which power k must be raised to equal n? I am interested in using bitwise operators; sure, I can find it by brute force (by the standard algorithm). Thanks a lot.
I doubt you're going to find a bitwise algorithm for determining that a number is a power of 5.
In general, given y = n^x, to find x, you need to use logarithms, i.e. x = log_n(y). Most languages don't offer a log_n function, but you can achieve it with the following identity:
log_n(y) = log(y) / log(n)
If y is an integer power of n, then x will be an integer. Of course, due to the limitations of finite-precision computer arithmetic, you won't necessarily get the exact answer with the method above.
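As an illustration of that caveat, a small sketch of my own (the name power_of and its conventions are mine): compute the candidate exponent with logarithms, then verify it with exact integer arithmetic.

#include <cmath>
#include <cstdio>

// Returns e such that n^e == y, or -1 if y is not a power of n.
long long power_of(unsigned long long y, unsigned long long n) {
    if (y == 0 || n < 2) return -1;
    long long e = std::llround(std::log((double)y) / std::log((double)n));
    unsigned long long p = 1;
    for (long long i = 0; i < e; ++i) p *= n;   // exact check; assumes n^e fits the type
    return (p == y) ? e : -1;
}

int main() {
    std::printf("%lld\n", power_of(625, 5));    // prints 4
    std::printf("%lld\n", power_of(626, 5));    // prints -1
}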
I'm afraid you can't do that with just simple bit magic. Bits are typically good for powers of 2. For powers of, say, 5 you'd probably need to operate in a base-5 system, where 1, 10, 100, 1000, 10000 (base 5) are 1, 5, 25, 125, 625 (base 10), etc. In a base-5 system you can recognize powers of 5 just as easily as powers of 2 in binary. But you'd first need to convert your numbers to that base.
For arbitrary k there is only the generic solution:
bool is_pow(unsigned long x, unsigned int base) {
    assert(base >= 2);
    if (x == 0) {
        return false;
    }
    unsigned long t = x;
    while (t % base == 0) {
        t /= base;
    }
    return t == 1;
}
When k is a power of two, you can speed things up by checking whether x is a power of two and whether the number of trailing zero bits of x is divisible by log2(k).
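A sketch of my own of that special case (assuming a GCC/Clang-style compiler for the __builtin_ctz intrinsics; C++20's std::countr_zero would do the same job):

#include <cassert>

bool is_pow_of_pow2(unsigned long x, unsigned int k) {
    assert(k >= 2 && (k & (k - 1)) == 0);     // only valid when k is a power of two
    if (x == 0) return false;
    if ((x & (x - 1)) != 0) return false;     // x must itself be a power of two
    unsigned int log2k = __builtin_ctz(k);    // log2(k)
    return __builtin_ctzl(x) % log2k == 0;    // trailing zeros of x divisible by log2(k); x == 1 gives k^0
}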
And if computational speed is important and your k is fixed, you can always use the trivial implementation:
bool is_pow5(unsigned long x) {
if (x == 5 || x == 25 || x == 125 || x == 625)
return true;
if (x < 3125)
return false;
// you got the idea
...
}
I was always wondering how I can make a function which calculates the power (e.g. 2^3) myself. In most languages these are included in the standard library, mostly as pow(double x, double y), but how can I write it myself?
I was thinking about for loops, but I think my brain got in a loop (when I wanted to do a power with a non-integer exponent, like 5^4.5, or negatives like 2^-21) and I went crazy ;)
So, how can I write a function which calculates the power of a real number? Thanks
Oh, maybe important to note: I cannot use functions which use powers (e.g. exp), which would make this ultimately useless.
Negative powers are not a problem, they're just the inverse (1/x) of the positive power.
Floating point powers are just a little bit more complicated; as you know, a fractional power is equivalent to a root (e.g. x^(1/2) == sqrt(x)) and you also know that multiplying powers with the same base is equivalent to adding their exponents.
With all the above, you can:
Decompose the exponent into an integer part and a fractional part.
Calculate the integer power with a loop (you can optimise it decomposing in factors and reusing partial calculations).
Calculate the root with any algorithm you like (any iterative approximation like bisection or Newton method could work).
Multiply the result.
If the exponent was negative, apply the inverse.
Example:
2^(-3.5) = (2^3 * 2^(1/2))^-1 = 1 / (2*2*2 * sqrt(2))
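A minimal sketch of my own of that recipe, restricted to positive bases and to exponents that are multiples of 0.5 so the root step is just a square root (done here with Newton's method); all names are mine:

double sqrt_newton(double x) {
    if (x == 0) return 0;
    double r = 0.5 * (x + 1);                  // one Newton step from 1; already >= sqrt(x)
    for (;;) {
        double next = 0.5 * (r + x / r);       // Newton step for r^2 - x = 0
        if (next >= r) return r;               // sequence is non-increasing; stop at the fixed point
        r = next;
    }
}

double int_pow(double x, unsigned n) {         // plain loop for the integer part
    double r = 1;
    while (n--) r *= x;
    return r;
}

double my_pow(double base, double exponent) {  // exponent must be a multiple of 0.5
    bool negative = exponent < 0;
    if (negative) exponent = -exponent;
    unsigned i = (unsigned)exponent;           // integer part
    bool half = (exponent - i) >= 0.5;         // remaining 0.5, if any
    double r = int_pow(base, i);
    if (half) r *= sqrt_newton(base);
    return negative ? 1.0 / r : r;             // negative exponent: take the inverse
}
// my_pow(2, -3.5) == 1 / (2*2*2 * sqrt(2)), matching the example above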
A^B = Log^-1(Log(A) * B)
Edit: yes, this definition really does provide something useful. For example, on an x86, it translates almost directly to FYL2X (Y * Log2(X)) and F2XM1 (2^x - 1):
fyl2x
fld st(0)
frndint
fsubr st(1),st
fxch st(1)
fchs
f2xm1
fld1
faddp st(1),st
fscale
fstp st(1)
The code ends up a little longer than you might expect, primarily because F2XM1 only works with numbers in the range -1.0..1.0. The fld st(0)/frndint/fsubr st(1),st piece subtracts off the integer part, so we're left with only the fraction. We apply F2XM1 to that, add the 1 back on, then use FSCALE to handle the integer part of the exponentiation.
Typically the implementation of the pow(double, double) function in math libraries is based on the identity:
pow(x,y) = pow(a, y * log_a(x))
Using this identity, you only need to know how to raise a single number a to an arbitrary exponent, and how to take a logarithm base a. You have effectively turned a complicated multi-variable function into two functions of a single variable, and a multiplication, which is pretty easy to implement. The most commonly chosen values of a are e or 2 -- e because e^x and log_e(1+x) have some very nice mathematical properties, and 2 because it has some nice properties for implementation in floating-point arithmetic.
The catch of doing it this way is that (if you want to get full accuracy) you need to compute the log_a(x) term (and its product with y) to higher accuracy than the floating-point representation of x and y. For example, if x and y are doubles, and you want to get a high accuracy result, you'll need to come up with some way to store intermediate results (and do arithmetic) in a higher-precision format. The Intel x87 format is a common choice, as are 64-bit integers (though if you really want a top-quality implementation, you'll need to do a couple of 96-bit integer computations, which are a little bit painful in some languages). It's much easier to deal with this if you implement powf(float,float), because then you can just use double for intermediate computations. I would recommend starting with that if you want to use this approach.
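To make the structural point concrete, here is a rough sketch of my own for the float case, leaning on the library's double-precision exp2/log2 only to illustrate why extra intermediate precision makes the job easy (a real implementation would supply its own exp2/log2 and handle special cases):

#include <cmath>

float powf_sketch(float x, float y) {
    // Happy path only: x > 0 and finite inputs. Zero, negative bases,
    // infinities and NaNs all need separate handling.
    double t = (double)y * std::log2((double)x);   // y * log2(x), carried in double
    return (float)std::exp2(t);                    // 2^(y * log2(x))
}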
The algorithm that I outlined is not the only possible way to compute pow. It is merely the most suitable for delivering a high-speed result that satisfies a fixed a priori accuracy bound. It is less suitable in some other contexts, and is certainly much harder to implement than the repeated-square[root]-ing algorithm that some others have suggested.
If you want to try the repeated square[root] algorithm, begin by writing an unsigned integer power function that uses repeated squaring only. Once you have a good grasp on the algorithm for that reduced case, you will find it fairly straightforward to extend it to handle fractional exponents.
There are two distinct cases to deal with: Integer exponents and fractional exponents.
For integer exponents, you can use exponentiation by squaring.
def pow(base, exponent):
    if exponent == 0:
        return 1
    elif exponent < 0:
        return 1 / pow(base, -exponent)
    elif exponent % 2 == 0:
        half_pow = pow(base, exponent // 2)
        return half_pow * half_pow
    else:
        return base * pow(base, exponent - 1)
The second "elif" is what distinguishes this from the naïve pow function. It allows the function to make O(log n) recursive calls instead of O(n).
For fractional exponents, you can use the identity a^b = C^(b*log_C(a)). It's convenient to take C=2, so a^b = 2^(b * log2(a)). This reduces the problem to writing functions for 2^x and log2(x).
The reason it's convenient to take C=2 is that floating-point numbers are stored in base-2 floating point. log2(a * 2^b) = log2(a) + b. This makes it easier to write your log2 function: You don't need to have it be accurate for every positive number, just on the interval [1, 2). Similarly, to calculate 2^x, you can multiply 2^(integer part of x) * 2^(fractional part of x). The first part is trivial to store in a floating point number, for the second part, you just need a 2^x function over the interval [0, 1).
The hard part is finding a good approximation of 2^x and log2(x). A simple approach is to use Taylor series.
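As a sketch of my own of that split (illustrative accuracy only): scale by the integer part with ldexp, and approximate 2^f for f in [0, 1) with a short truncated series for e^(f*ln2):

#include <cmath>

double exp2_sketch(double x) {
    int    i = (int)std::floor(x);              // integer part
    double f = x - i;                           // fractional part in [0, 1)
    const double LN2 = 0.6931471805599453;
    double t = f * LN2, term = 1.0, sum = 1.0;  // sum accumulates e^t = 2^f
    for (int k = 1; k <= 20; ++k) {
        term *= t / k;                          // term becomes t^k / k!
        sum += term;
    }
    return std::ldexp(sum, i);                  // sum * 2^i
}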
Per definition:
a^b = exp(b ln(a))
where exp(x) = 1 + x + x^2/2 + x^3/3! + x^4/4! + x^5/5! + ...
where n! = 1 * 2 * ... * n.
In practice, you could store an array of the first 10 values of 1/n!, and then approximate
exp(x) = 1 + x + x^2/2 + x^3/3! + ... + x^10/10!
because 10! is a huge number, so 1/10! is very small (2.7557319224⋅10^-7).
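A tiny sketch of my own of that truncated series with the reciprocal factorials stored up front (only reasonable for small |x|):

double exp_taylor10(double x) {
    static const double inv_fact[11] = {
        1.0, 1.0, 1.0/2, 1.0/6, 1.0/24, 1.0/120, 1.0/720,
        1.0/5040, 1.0/40320, 1.0/362880, 1.0/3628800
    };
    double sum = 0.0, xn = 1.0;                 // xn holds x^n
    for (int n = 0; n <= 10; ++n) {
        sum += xn * inv_fact[n];
        xn *= x;
    }
    return sum;
}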
Wolfram functions gives a wide variety of formulae for calculating powers. Some of them would be very straightforward to implement.
For positive integer powers, look at exponentiation by squaring and addition-chain exponentiation.
Using three self-implemented functions iPow(x, n), Ln(x) and Exp(x), I'm able to compute fPow(x, a), x and a being doubles. None of the functions below use library functions, just iteration.
Some explanation about functions implemented:
(1) iPow(x, n): x is double, n is int. This is a simple iteration, as n is an integer.
(2) Ln(x): This function uses the Taylor series iteration. The series used in the iteration is Σ (from i = 0 to n) {(1 / (2 * i + 1)) * ((x - 1) / (x + 1)) ^ (2 * i + 1)}; ln(x) is twice this sum, hence the final multiplication by 2 in the code. The symbol ^ denotes the power function Pow(x, n) implemented in the 1st function, which uses simple iteration.
(3) Exp(x): This function, again, uses the Taylor Series iteration. The series used in iteration is Σ (from int i = 0 to n) {x^i / i!}. Here, the ^ denotes the power function, but it is not computed by calling the 1st Pow(x, n) function; instead it is implemented within the 3rd function, concurrently with the factorial, using d *= x / i. I felt I had to use this trick, because in this function, iteration takes some more steps relative to the other functions and the factorial (i!) overflows most of the time. In order to make sure the iteration does not overflow, the power function in this part is iterated concurrently with the factorial. This way, I overcame the overflow.
(4) fPow(x, a): x and a are both doubles. This function does nothing but just call the other three functions implemented above. The main idea in this function depends on some calculus: fPow(x, a) = Exp(a * Ln(x)). And now, I have all the functions iPow, Ln and Exp with iteration already.
n.b. I used a constant MAX_DELTA_DOUBLE in order to decide in which step to stop the iteration. I've set it to 1.0E-15, which seems reasonable for doubles. So, the iteration stops if (delta < MAX_DELTA_DOUBLE) If you need some more precision, you can use long double and decrease the constant value for MAX_DELTA_DOUBLE, to 1.0E-18 for example (1.0E-18 would be the minimum).
Here is the code, which works for me.
#define MAX_DELTA_DOUBLE 1.0E-15
#define EULERS_NUMBER 2.718281828459045
double MathAbs_Double (double x) {
return ((x >= 0) ? x : -x);
}
int MathAbs_Int (int x) {
return ((x >= 0) ? x : -x);
}
double MathPow_Double_Int(double x, int n) {
double ret;
if ((x == 1.0) || (n == 1)) {
ret = x;
} else if (n < 0) {
ret = 1.0 / MathPow_Double_Int(x, -n);
} else {
ret = 1.0;
while (n--) {
ret *= x;
}
}
return (ret);
}
double MathLn_Double(double x) {
double ret = 0.0, d;
if (x > 0) {
int n = 0;
do {
int a = 2 * n + 1;
d = (1.0 / a) * MathPow_Double_Int((x - 1) / (x + 1), a);
ret += d;
n++;
} while (MathAbs_Double(d) > MAX_DELTA_DOUBLE);
} else {
printf("\nerror: x < 0 in ln(x)\n");
exit(-1);
}
return (ret * 2);
}
double MathExp_Double(double x) {
double ret;
if (x == 1.0) {
ret = EULERS_NUMBER;
} else if (x < 0) {
ret = 1.0 / MathExp_Double(-x);
} else {
int n = 2;
double d;
ret = 1.0 + x;
do {
d = x;
for (int i = 2; i <= n; i++) {
d *= x / i;
}
ret += d;
n++;
} while (d > MAX_DELTA_DOUBLE);
}
return (ret);
}
double MathPow_Double_Double(double x, double a) {
double ret;
if ((x == 1.0) || (a == 1.0)) {
ret = x;
} else if (a < 0) {
ret = 1.0 / MathPow_Double_Double(x, -a);
} else {
ret = MathExp_Double(a * MathLn_Double(x));
}
return (ret);
}
It's an interesting exercise. Here are some suggestions, which you should try in this order:
Use a loop.
Use recursion (not better, but interesting nonetheless)
Optimize your recursion vastly by using divide-and-conquer techniques
Use logarithms
You can build the pow function like this:
static double pows (double p_nombre, double p_puissance)
{
    double nombre = p_nombre;
    double i = 0;
    for (i = 0; i < (p_puissance - 1); i++) {
        nombre = nombre * p_nombre;
    }
    return (nombre);
}
You can build the floor function like this:
static double floors(double p_nomber)
{
    double x = p_nomber;
    long partent = (long) x;
    if (x < 0 && x != partent)  // negative non-integers round down
    {
        return (partent - 1);
    }
    else
    {
        return (partent);
    }
}
Best regards
A better algorithm to efficiently calculate positive integer powers is to repeatedly square the base, while keeping track of the extra remainder multiplicands. Here is a sample solution in Python that should be relatively easy to understand and translate into your preferred language:
def power(base, exponent):
    remaining_multiplicand = 1
    result = base
    while exponent > 1:
        remainder = exponent % 2
        if remainder > 0:
            remaining_multiplicand = remaining_multiplicand * result
        exponent = (exponent - remainder) // 2
        result = result * result
    return result * remaining_multiplicand
To make it handle negative exponents, all you have to do is calculate the positive version and divide 1 by the result, so that should be a simple modification to the above code. Fractional exponents are considerably more difficult, since it means essentially calculating an nth-root of the base, where n = 1/abs(exponent % 1) and multiplying the result by the result of the integer portion power calculation:
power(base, exponent - (exponent % 1))
You can calculate roots to a desired level of accuracy using Newton's method. Check out wikipedia article on the algorithm.
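For the root step, a sketch of my own of Newton's method for the n-th root (iterate r <- ((n-1)*r + x/r^(n-1)) / n; assumes x > 0 and n >= 1):

double nth_root(double x, int n) {
    double r = x > 1 ? x : 1.0;                      // start at or above the root
    for (;;) {
        double rn1 = 1.0;                            // r^(n-1)
        for (int k = 0; k < n - 1; ++k) rn1 *= r;
        double next = ((n - 1) * r + x / rn1) / n;   // Newton step
        if (next >= r) return r;                     // non-increasing sequence has converged
        r = next;
    }
}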
I am using fixed-point long arithmetic and my pow is log2/exp2 based. Numbers consist of:
int sig = { -1; +1 } signum
DWORD a[A+B] number
A is number of DWORDs for integer part of number
B is number of DWORDs for fractional part
My simplified solution is this:
//---------------------------------------------------------------------------
longnum exp2 (const longnum &x)
{
int i,j;
longnum c,d;
c.one();
if (x.iszero()) return c;
i=x.bits()-1;
for(d=2,j=_longnum_bits_b;j<=i;j++,d*=d)
if (x.bitget(j))
c*=d;
for(i=0,j=_longnum_bits_b-1;i<_longnum_bits_b;j--,i++)
if (x.bitget(j))
c*=_longnum_log2[i];
if (x.sig<0) {d.one(); c=d/c;}
return c;
}
//---------------------------------------------------------------------------
longnum log2 (const longnum &x)
{
int i,j;
longnum c,d,dd,e,xx;
c.zero(); d.one(); e.zero(); xx=x;
if (xx.iszero()) return c; //**** error: log2(0) = infinity
if (xx.sig<0) return c; //**** error: log2(negative x) ... no result possible
if (d.geq(x,d)==0) {xx=d/xx; xx.sig=-1;}
i=xx.bits()-1;
e.bitset(i); i-=_longnum_bits_b;
for (;i>0;i--,e>>=1) // integer part
{
dd=d*e;
j=dd.geq(dd,xx);
if (j==1) continue; // dd> xx
c+=i; d=dd;
if (j==2) break; // dd==xx
}
for (i=0;i<_longnum_bits_b;i++) // fractional part
{
dd=d*_longnum_log2[i];
j=dd.geq(dd,xx);
if (j==1) continue; // dd> xx
c.bitset(_longnum_bits_b-i-1); d=dd;
if (j==2) break; // dd==xx
}
c.sig=xx.sig;
c.iszero();
return c;
}
//---------------------------------------------------------------------------
longnum pow (const longnum &x,const longnum &y)
{
//x^y = exp2(y*log2(x))
int ssig=+1; longnum c; c=x;
if (y.iszero()) {c.one(); return c;} // ?^0=1
if (c.iszero()) return c; // 0^?=0
if (c.sig<0)
{
c.overflow(); c.sig=+1;
if (y.isreal()) {c.zero(); return c;} //**** error: negative x ^ noninteger y
if (y.bitget(_longnum_bits_b)) ssig=-1;
}
c=exp2(log2(c)*y); c.sig=ssig; c.iszero();
return c;
}
//---------------------------------------------------------------------------
where:
_longnum_bits_a = A*32
_longnum_bits_b = B*32
_longnum_log2[i] = 2 ^ (1/(2^i)) ... precomputed sqrt table
_longnum_log2[0]=sqrt(2)
_longnum_log2[1]=sqrt(tab[0])
_longnum_log2[i]=sqrt(tab[i-1])
longnum::zero() sets *this=0
longnum::one() sets *this=+1
bool longnum::iszero() returns (*this==0)
bool longnum::isnonzero() returns (*this!=0)
bool longnum::isreal() returns (true if fractional part !=0)
bool longnum::isinteger() returns (true if fractional part ==0)
int longnum::bits() returns the number of used bits in the number, counted from the LSB
longnum::bitget()/bitset()/bitres()/bitxor() are bit access
longnum.overflow() rounds the number if there was an overflow X.FFFFFFFFFF...FFFFFFFFF??h -> (X+1).0000000000000...000000000h
int longnum::geq(x,y) is a comparison of |x|,|y|; returns 0,1,2 for (<,>,==)
All you need to understand this code is that numbers in binary form consist of a sum of powers of 2; when you need to compute 2^num, it can be rewritten as this
2^(b(-n)*2^(-n) + ... + b(+m)*2^(+m))
where n are fractional bits and m are integer bits. Multiplication/division by 2 in binary form is simple bit shifting, so if you put it all together you get code for exp2 similar to mine. log2 is based on binary search... changing the result bits from MSB to LSB until it matches the searched value (a very similar algorithm as for fast sqrt computation). Hope this helps clarify things...
A lot of approaches are given in other answers. Here is something that I thought may be useful in case of integral powers.
In the case of an integer power x of n, i.e. n^x, the straightforward approach would take x-1 multiplications. In order to optimize this, we can use dynamic programming and reuse an earlier multiplication result to avoid all x multiplications. For example, in 5^9, we can, say, make batches of 3, i.e. calculate 5^3 once, get 125 and then cube 125 using the same logic, taking only 4 multiplications in the process, instead of 8 multiplications with the straightforward way.
The question is what is the ideal size of the batch b so that the number of multiplications is minimum. So let's write the equation for this. If f(x,b) is the function representing the number of multiplications entailed in calculating n^x using the above method, then
f(x,b) = (b - 1) + (x/b - 1) = b + x/b - 2
Explanation: A product of a batch of p numbers will take p-1 multiplications. If we divide the x multiplications into b batches, there would be (x/b)-1 multiplications required inside each batch, and b-1 multiplications required for all b batches.
Now we can calculate the first derivative of this function with respect to b and equate it to 0 to get the b for the least number of multiplications:
df/db = 1 - x/b^2 = 0  =>  b = sqrt(x)
Now put back this value of b into the function f(x,b) to get the least number of multiplications:
f(x, sqrt(x)) = 2*sqrt(x) - 2
For all positive x, this value is less than the multiplications by the straightforward way.
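A sketch of my own of the batching idea (the name batch_pow is mine; overflow is ignored for clarity): pick b close to sqrt(x), compute base^b once, raise that to x/b, then multiply in the leftover base^(x mod b):

unsigned long long batch_pow(unsigned long long base, unsigned x) {
    if (x == 0) return 1;
    unsigned b = 1;
    while ((b + 1) * (b + 1) <= x) ++b;                     // b = floor(sqrt(x))
    unsigned long long inner = 1;
    for (unsigned i = 0; i < b; ++i) inner *= base;         // base^b
    unsigned long long result = 1;
    for (unsigned i = 0; i < x / b; ++i) result *= inner;   // (base^b)^(x/b)
    for (unsigned i = 0; i < x % b; ++i) result *= base;    // leftover base^(x mod b)
    return result;
}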
Maybe you can use a Taylor series expansion. The Taylor series of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor series are equal near this point. Taylor series are named after Brook Taylor, who introduced them in 1715.