Dividing two integers and rounding up the result, without using floating point

Dividing two integers and rounding up the result, without using floating point - c++

I need to divide two numbers and round it up. Are there any better way to do this?
int myValue = (int) ceil( (float)myIntNumber / myOtherInt );
I find an overkill to have to cast two different time. (the extern int cast is just to shut down the warning)
Note I have to cast internally to float otherwise
int a = ceil(256/11); //> Should be 24, but it is 23
^example

Assuming that both myIntNumber and myOtherInt are positive, you could do:
int myValue = (myIntNumber + myOtherInt - 1) / myOtherInt;

With help from DyP, came up with the following branchless formula:
int idiv_ceil ( int numerator, int denominator )
{
return numerator / denominator
+ (((numerator < 0) ^ (denominator > 0)) && (numerator%denominator));
}
It avoids floating-point conversions and passes a basic suite of unit tests, as shown here:
http://ideone.com/3OrviU
Here's another version that avoids the modulo operator.
int idiv_ceil ( int numerator, int denominator )
{
int truncated = numerator / denominator;
return truncated + (((numerator < 0) ^ (denominator > 0)) &&
(numerator - truncated*denominator));
}
http://ideone.com/Z41G5q
The first one will be faster on processors where IDIV returns both quotient and remainder (and the compiler is smart enough to use that).

Maybe it is just easier to do a:
int result = dividend / divisor;
if(dividend % divisor != 0)
result++;

Benchmarks
Since a lot of different methods are shown in the answers and none of the answers actually prove any advantages in terms of performance I tried to benchmark them myself. My plan was to write an answer that contains a short table and a definite answer which method is the fastest.
Unfortunately it wasn't that simple. (It never is.) It seems that the performance of the rounding formulas depend on the used data type, compiler and optimization level.
In one case there is an increase of speed by 7.5× from one method to another. So the impact can be significant for some people.
TL;DR
For long integers the naive version using a type cast to float and std::ceil was actually the fastest. This was interesting for me personally since I intended to use it with size_t which is usually defined as unsigned long.
For ordinary ints it depends on your optimization level. For lower levels #Jwodder's solution performs best. For the highest levels std::ceil was the optimal one. With one exception: For the clang/unsigned int combination #Jwodder's was still better.
The solutions from the accepted answer never really outperformed the other two. You should keep in mind however that #Jwodder's solution doesn't work with negatives.
Results are at the bottom.
Implementations
To recap here are the four methods I benchmarked and compared:
#Jwodder's version (Jwodder)
template<typename T>
inline T divCeilJwodder(const T& numerator, const T& denominator)
{
return (numerator + denominator - 1) / denominator;
}
#Ben Voigt's version using modulo (VoigtModulo)
template<typename T>
inline T divCeilVoigtModulo(const T& numerator, const T& denominator)
{
return numerator / denominator + (((numerator < 0) ^ (denominator > 0))
&& (numerator%denominator));
}
#Ben Voigt's version without using modulo (VoigtNoModulo)
inline T divCeilVoigtNoModulo(const T& numerator, const T& denominator)
{
T truncated = numerator / denominator;
return truncated + (((numerator < 0) ^ (denominator > 0))
&& (numerator - truncated*denominator));
}
OP's implementation (TypeCast)
template<typename T>
inline T divCeilTypeCast(const T& numerator, const T& denominator)
{
return (int)std::ceil((double)numerator / denominator);
}
Methodology
In a single batch the division is performed 100 million times. Ten batches are calculated for each combination of Compiler/Optimization level, used data type and used implementation. The values shown below are the averages of all 10 batches in milliseconds. The errors that are given are standard deviations.
The whole source code that was used can be found here. Also you might find this script useful which compiles and executes the source with different compiler flags.
The whole benchmark was performed on a i7-7700K. The used compiler versions were GCC 10.2.0 and clang 11.0.1.
Results
Now without further ado here are the results:
DataTypeAlgorithm
GCC-O0
-O1
-O2
-O3
-Os
-Ofast
-Og
clang-O0
-O1
-O2
-O3
-Ofast
-Os
-Oz
unsigned
Jwodder
264.1 ± 0.9 🏆
175.2 ± 0.9 🏆
153.5 ± 0.7 🏆
175.2 ± 0.5 🏆
153.3 ± 0.5
153.4 ± 0.8
175.5 ± 0.6 🏆
329.4 ± 1.3 🏆
220.0 ± 1.3 🏆
146.2 ± 0.6 🏆
146.2 ± 0.6 🏆
146.0 ± 0.5 🏆
153.2 ± 0.3 🏆
153.5 ± 0.6 🏆
VoigtModulo
528.5 ± 2.5
306.5 ± 1.0
175.8 ± 0.7
175.2 ± 0.5 🏆
175.6 ± 0.7
175.4 ± 0.6
352.0 ± 1.0
588.9 ± 6.4
408.7 ± 1.5
164.8 ± 1.0
164.0 ± 0.4
164.1 ± 0.4
175.2 ± 0.5
175.8 ± 0.9
VoigtNoModulo
375.3 ± 1.5
175.7 ± 1.3 🏆
192.5 ± 1.4
197.6 ± 1.9
200.6 ± 7.2
176.1 ± 1.5
197.9 ± 0.5
541.0 ± 1.8
263.1 ± 0.9
186.4 ± 0.6
186.4 ± 1.2
187.2 ± 1.1
197.2 ± 0.8
197.1 ± 0.7
TypeCast
348.5 ± 2.7
231.9 ± 3.9
234.4 ± 1.3
226.6 ± 1.0
137.5 ± 0.8 🏆
138.7 ± 1.7 🏆
243.8 ± 1.4
591.2 ± 2.4
591.3 ± 2.6
155.8 ± 1.9
155.9 ± 1.6
155.9 ± 2.4
214.6 ± 1.9
213.6 ± 1.1
unsigned long
Jwodder
658.6 ± 2.5
546.3 ± 0.9
549.3 ± 1.8
549.1 ± 2.8
540.6 ± 3.4
548.8 ± 1.3
486.1 ± 1.1
638.1 ± 1.8
613.3 ± 2.1
190.0 ± 0.8 🏆
182.7 ± 0.5
182.4 ± 0.5
496.2 ± 1.3
554.1 ± 1.0
VoigtModulo
1,169.0 ± 2.9
1,015.9 ± 4.4
550.8 ± 2.0
504.0 ± 1.4
550.3 ± 1.2
550.5 ± 1.3
1,020.1 ± 2.9
1,259.0 ± 9.0
1,136.5 ± 4.2
187.0 ± 3.4 🏆
199.7 ± 6.1
197.6 ± 1.0
549.4 ± 1.7
506.8 ± 4.4
VoigtNoModulo
768.1 ± 1.7
559.1 ± 1.8
534.4 ± 1.6
533.7 ± 1.5
559.5 ± 1.7
534.3 ± 1.5
571.5 ± 1.3
879.5 ± 10.8
617.8 ± 2.1
223.4 ± 1.3
231.3 ± 1.3
231.4 ± 1.1
594.6 ± 1.9
572.2 ± 0.8
TypeCast
353.3 ± 2.5 🏆
267.5 ± 1.7 🏆
248.0 ± 1.6 🏆
243.8 ± 1.2 🏆
154.2 ± 0.8 🏆
154.1 ± 1.0 🏆
263.8 ± 1.8 🏆
365.5 ± 1.6 🏆
316.9 ± 1.8 🏆
189.7 ± 2.1 🏆
156.3 ± 1.8 🏆
157.0 ± 2.2 🏆
155.1 ± 0.9 🏆
176.5 ± 1.2 🏆
int
Jwodder
307.9 ± 1.3 🏆
175.4 ± 0.9 🏆
175.3 ± 0.5 🏆
175.4 ± 0.6 🏆
175.2 ± 0.5
175.1 ± 0.6
175.1 ± 0.5 🏆
307.4 ± 1.2 🏆
219.6 ± 0.6 🏆
146.0 ± 0.3 🏆
153.5 ± 0.5
153.6 ± 0.8
175.4 ± 0.7 🏆
175.2 ± 0.5 🏆
VoigtModulo
528.5 ± 1.9
351.9 ± 4.6
175.3 ± 0.6 🏆
175.2 ± 0.4 🏆
197.1 ± 0.6
175.2 ± 0.8
373.5 ± 1.1
598.7 ± 5.1
460.6 ± 1.3
175.4 ± 0.4
164.3 ± 0.9
164.0 ± 0.4
176.3 ± 1.6 🏆
460.0 ± 0.8
VoigtNoModulo
398.0 ± 2.5
241.0 ± 0.7
199.4 ± 5.1
219.2 ± 1.0
175.9 ± 1.2
197.7 ± 1.2
242.9 ± 3.0
543.5 ± 2.3
350.6 ± 1.0
186.6 ± 1.2
185.7 ± 0.3
186.3 ± 1.1
197.1 ± 0.6
373.3 ± 1.6
TypeCast
338.8 ± 4.9
228.1 ± 0.9
230.3 ± 0.8
229.5 ± 9.4
153.8 ± 0.4 🏆
138.3 ± 1.0 🏆
241.1 ± 1.1
590.0 ± 2.1
589.9 ± 0.8
155.2 ± 2.4
149.4 ± 1.6 🏆
148.4 ± 1.2 🏆
214.6 ± 2.2
211.7 ± 1.6
long
Jwodder
758.1 ± 1.8
600.6 ± 0.9
601.5 ± 2.2
601.5 ± 2.8
581.2 ± 1.9
600.6 ± 1.8
586.3 ± 3.6
745.9 ± 3.6
685.8 ± 2.2
183.1 ± 1.0
182.5 ± 0.5
182.6 ± 0.6
553.2 ± 1.5
488.0 ± 0.8
VoigtModulo
1,360.8 ± 6.1
1,202.0 ± 2.1
600.0 ± 2.4
600.0 ± 3.0
607.0 ± 6.8
599.0 ± 1.6
1,187.2 ± 2.6
1,439.6 ± 6.7
1,346.5 ± 2.9
197.9 ± 0.7
208.2 ± 0.6
208.0 ± 0.4
548.9 ± 1.4
1,326.4 ± 3.0
VoigtNoModulo
844.5 ± 6.9
647.3 ± 1.3
628.9 ± 1.8
627.9 ± 1.6
629.1 ± 2.4
629.6 ± 4.4
668.2 ± 2.7
1,019.5 ± 3.2
715.1 ± 8.2
224.3 ± 4.8
219.0 ± 1.0
219.0 ± 0.6
561.7 ± 2.5
769.4 ± 9.3
TypeCast
366.1 ± 0.8 🏆
246.2 ± 1.1 🏆
245.3 ± 1.8 🏆
244.6 ± 1.1 🏆
154.6 ± 1.6 🏆
154.3 ± 0.5 🏆
257.4 ± 1.5 🏆
591.8 ± 4.1 🏆
590.4 ± 1.3 🏆
154.5 ± 1.3 🏆
135.4 ± 8.3 🏆
132.9 ± 0.7 🏆
132.8 ± 1.2 🏆
177.4 ± 2.3 🏆
Now I can finally get on with my life :P

Integer division with round-up.
Only 1 division executed per call, no % or * or conversion to/from floating point, works for positive and negative int. See note (1).
n (numerator) = OPs myIntNumber;
d (denominator) = OPs myOtherInt;
The following approach is simple. int division rounds toward 0. For negative quotients, this is a round up so nothing special is needed. For positive quotients, add d-1 to effect a round up, then perform an unsigned division.
Note (1) The usual divide by 0 blows things up and MININT/-1 fails as expected on 2's compliment machines.
int IntDivRoundUp(int n, int d) {
// If n and d are the same sign ...
if ((n < 0) == (d < 0)) {
// If n (and d) are negative ...
if (n < 0) {
n = -n;
d = -d;
}
// Unsigned division rounds down. Adding d-1 to n effects a round up.
return (((unsigned) n) + ((unsigned) d) - 1)/((unsigned) d);
}
else {
return n/d;
}
}
[Edit: test code removed, see earlier rev as needed]

Just use
int ceil_of_division = ((dividend-1)/divisor)+1;
For example:
for (int i=0;i<20;i++)
std::cout << i << "/8 = " << ((i-1)/8)+1 << std::endl;

A small hack is to do:
int divideUp(int a, int b) {
result = (a-1)/b + 1;
}
// Proof:
a = b*N + k (always)
if k == 0, then
(a-1) == b*N - 1
(a-1)/b == N - 1
(a-1)/b + 1 == N ---> Good !
if k > 0, then
(a-1) == b*N + l
(a-1)/b == N
(a-1)/b + 1 == N+1 ---> Good !

Instead of using the ceil function before casting to int, you can add a constant which is very nearly (but not quite) equal to 1 - this way, nearly anything (except a value which is exactly or incredibly close to an actual integer) will be increased by one before it is truncated.
Example:
#define EPSILON (0.9999)
int myValue = (int)(((float)myIntNumber)/myOtherInt + EPSILON);
EDIT: after seeing your response to the other post, I want to clarify that this will round up, not away from zero - negative numbers will become less negative, and positive numbers will become more positive.

Related

Understanding this way of computing ( x^e mod n)

working on a RSA encryption in c++ and found that the pow() function in cmath was giving me the incorrect result.
after looking online I came across some code that would do the above process for me but I am having difficulty understanding it.
this is the code:
long long int modpow(long long int base, long long int exp, long long int modulus) {
base %= modulus;
long long int result = 1;
while (exp > 0) {
if (exp & 1) {
result = (result * base) % modulus;
}
base = (base * base) % modulus;
exp >>= 1;
}
return result;
}
(this code isn't the original)
I am struggling with understanding this function.
I know that exp >>=1; is a left shift 1 bit and (exp & 1) returns 1 or 0 based on the least significant bit but what i dont understand is how that contributes to the final answer.
for example:
if (exp & 1) {
result = (result * base) % modulus;
}
what is the purpose of (result * base) % modulus if exp is odd?
hoping someone can please explain this function to me as I dont want to just copy it over.

The code was written to be "clever" instead of clear. This cryptic style usually should not be done outside core libraries (where performance is crucial), and even when it is used, it would be nice to have comments explaining what is going on.
Here is an annotated version of the code.
long long int modpow(long long int base, long long int exp, long long int modulus)
{
base %= modulus; // Eliminate factors; keeps intermediate results smaller
long long int result = 1; // Start with the multiplicative unit (a.k.a. one)
// Throughout, we are calculating:
// `(result*base^exp)%modulus`
while (exp > 0) { // While the exponent has not been exhausted.
if (exp & 1) { // If the exponent is odd
result = (result * base) % modulus; // Consume one application of the base, logically
// (but not actually) reducing the exponent by one.
// That is: result * base^exp == (result*base)*base^(exp-1)
}
base = (base * base) % modulus; // The exponent is logically even. Apply B^(2n) == (B^2)^n.
exp >>= 1; // The base is squared and the exponent is divided by 2
}
return result;
}
Does that make more sense now?
For those wondering how this cryptic code could be made clearer, I present the following version. There are three main improvements.
First, bitwise operations have been replaced by the equivalent arithmetic operations. If one were to prove that the algorithm works, one would use arithmetic, not bitwise operators. In fact, the algorithm works regardless of how the numbers are represented – there need not be a concept of "bits", much less of "bitwise operators". Thus the natural way to implement this algorithm is with arithmetic. Using bitwise operators removes clarity with almost no benefit. Compilers are smart enough to produce identical machine code, with one exception. Since exp was declared long long int instead of long long unsigned, there is an extra step when calculating exp /= 2 compared to exp >>= 1. (I do not know why exp is signed; the function is both conceptually meaningless and technically incorrect for negative exponents.) See also premature-optimization.
Second, I created a helper function for improved readability. While the improvement is minor, it comes at no performance cost. I'd expect the function to be inlined by any compiler worth its salt.
// Wrapper for detecting if an integer is odd.
bool is_odd(long long int n)
{
return n % 2 != 0;
}
Third, comments have been added to explain what is going on. While some people (not I) might think that "the standard right-to-left modular binary exponentiation algorithm" is required knowledge for every C++ coder, I prefer to make fewer assumptions about the people who might read my code in the future. Especially if that person is me, coming back to the code after years away from it.
And now, the code, as I would prefer to see the current functionality written:
// Returns `(base**exp) % modulus`, where `**` denotes exponentiation.
// Assumes `exp` is non-negative.
// Assumes `modulus` is non-zero.
// If `exp` is zero, assumes `modulus` is neither 1 nor -1.
long long int modpow(long long int base, long long int exp, long long int modulus)
{
// NOTE: This algorithm is known as the "right-to-left binary method" of
// "modular exponentiation".
// Throughout, we'll keep numbers smallish by using `(A*B) % C == ((A%C)*B) % C`.
// The first application of this principle is to the base.
base %= modulus;
// Intermediate results will be stored modulo `modulus`.
long long int result = 1;
// Loop invariant:
// The value to return is `(result * base**exp) % modulus`.
// Loop goal:
// Reduce `exp` to the point where `base**exp` is 1.
while (exp > 0) {
if ( is_odd(exp) ) {
// Shift one factor of `base` to `result`:
// `result * base^exp == (result*base) * base^(exp-1)`
result = (result * base) % modulus;
//--exp; // logically happens, but optimized out.
// We are now in the "`exp` is even" case.
}
// Reduce the exponent by increasing the base: `B**(2n) == (B**2)**n`.
base = (base * base) % modulus;
exp /= 2;
}
return result;
}
The resulting machine code is almost identical. If performance really is critical, I could see going back to exp >>= 1, but only if changing the type of exp is not allowed.

Newton-Raphson Division With Big Integers

I'm making a BigInt class as a programming exercise. It uses a vector of 2's complement signed ints in base-65536 (so that 32-bit multiplications don't overflow. I will increase the base once I get it fully working).
All of the basic math operations are coded, with one problem: division is painfully slow with the basic algorithm I was able to create. (It kind of works like binary division for each digit of the quotient... I'm not going to post it unless someone wants to see it....)
Instead of my slow algorithm, I want to use Newton-Raphson to find the (shifted) reciprocal and then multiply (and shift). I think I have my head around the basics: you give the formula (x1 = x0(2 - x0 * divisor)) a good initial guess, and then after some amount of iterations, x converges to the reciprocal. This part seems easy enough... but I am running into some problems when trying to apply this formula to big integers:
Problem 1:
Because I am working with integers... well... I can't use fractions. This seems to cause x to always diverge (x0 * divisor must be <2 it seems?). My intuition tells me there should be some modification to the equation that would allow it to work integers (to some accuracy) but I am really struggling to find out what it is. (My lack of math skills are beating me up here....) I think I need to find some equivalent equation where instead of d there is d*[base^somePower]? Can there be some equation like (x1 = x0(2 - x0 * d)) that works with whole numbers?
Problem 2:
When I use Newton's formula to find the reciprocal of some numbers, the result ends up being just a small faction below what the answer should be... ex. when trying to find reciprocal of 4 (in decimal):
x0 = 0.3
x1 = 0.24
x2 = 0.2496
x3 = 0.24999936
x4 = 0.2499999999983616
x5 = 0.24999999999999999999998926258176
If I were representing numbers in base-10, I would want a result of 25 (and to remember to right shift product by 2). With some reciprocals such as 1/3, you can simply truncate the result after you know you have enough accuracy. But how can I pull out the correct reciprocal from the above result?
Sorry if this is all too vague or if I'm asking for too much. I looked through Wikipedia and all of the research papers I could find on Google, but I feel like I'm banging my head against a wall. I appreciate any help anyone can give me!
...
Edit: Got the algorithm working, although it is much slower than I expected. I actually lost a lot of speed compared to my old algorithm, even on numbers with thousands of digits... I'm still missing something. It's not a problem with multiplication, which is very fast. (I am indeed using Karatsuba's algorithm).
For anyone interested, here is my current iteration of the Newton-Raphson algorithm:
bigint operator/(const bigint& lhs, const bigint& rhs) {
if (rhs == 0) throw overflow_error("Divide by zero exception");
bigint dividend = lhs;
bigint divisor = rhs;
bool negative = 0;
if (dividend < 0) {
negative = !negative;
dividend.invert();
}
if (divisor < 0) {
negative = !negative;
divisor.invert();
}
int k = dividend.numBits() + divisor.numBits();
bigint pow2 = 1;
pow2 <<= k + 1;
bigint x = dividend - divisor;
bigint lastx = 0;
bigint lastlastx = 0;
while (1) {
x = (x * (pow2 - x * divisor)) >> k;
if (x == lastx || x == lastlastx) break;
lastlastx = lastx;
lastx = x;
}
bigint quotient = dividend * x >> k;
if (dividend - (quotient * divisor) >= divisor) quotient++;
if (negative)quotient.invert();
return quotient;
}
And here is my (really ugly) old algorithm that is faster:
bigint operator/(const bigint& lhs, const bigint & rhs) {
if (rhs == 0) throw overflow_error("Divide by zero exception");
bigint dividend = lhs;
bigint divisor = rhs;
bool negative = 0;
if (dividend < 0) {
negative = !negative;
dividend.invert();
}
if (divisor < 0) {
negative = !negative;
divisor.invert();
}
bigint remainder = 0;
bigint quotient = 0;
while (dividend.value.size() > 0) {
remainder.value.insert(remainder.value.begin(), dividend.value.at(dividend.value.size() - 1));
remainder.value.push_back(0);
remainder.unPad();
dividend.value.pop_back();
if (divisor > remainder) {
quotient.value.push_back(0);
} else {
int count = 0;
int i = MSB;
bigint value = 0;
while (i > 0) {
bigint increase = divisor * i;
bigint next = value + increase;
if (next <= remainder) {
value = next;
count += i;
}
i >>= 1;
}
quotient.value.push_back(count);
remainder -= value;
}
}
for (int i = 0; i < quotient.value.size() / 2; i++) {
int swap = quotient.value.at(i);
quotient.value.at(i) = quotient.value.at((quotient.value.size() - 1) - i);
quotient.value.at(quotient.value.size() - 1 - i) = swap;
}
if (negative)quotient.invert();
quotient.unPad();
return quotient;
}

First of all, you can implement division in time O(n^2) and with reasonable constant, so it's not (much) slower than the naive multiplication. However, if you use Karatsuba-like algorithm, or even FFT-based multiplication algorithm, then you indeed can speedup your division algorithm using Newton-Raphson.
A Newton-Raphson iteration for calculating the reciprocal of x is q[n+1]=q[n]*(2-q[n]*x).
Suppose we want to calculate floor(2^k/B) where B is a positive integer. WLOG, B≤2^k; otherwise, the quotient is 0. The Newton-Raphson iteration for x=B/2^k yields q[n+1]=q[n]*(2-q[n]*B/2^k). we can rearrange it as
q[n+1]=q[n]*(2^(k+1)-q[n]*B) >> k
Each iteration of this kind requires only integer multiplications and bit shifts. Does it converge to floor(2^k/B)? Not necessarily. However, in the worst case, it eventually alternates between floor(2^k/B) and ceiling(2^k/B) (Prove it!). So you can use some not-so-clever test to see if you are in this case, and extract floor(2^k/B). (this "not-so-clever test" should be alot faster than the multiplications in each iteration; However, it will be nice to optimize this thing).
Indeed, calculating floor(2^k/B) suffices in order to calculate floor(A/B) for any positive integers A,B. Take k such that A*B≤2^k, and verify floor(A/B)=A*ceiling(2^k/B) >> k.
Lastly, a simple but important optimization for this approach is to truncate multiplications (i.e. calculate only the higher bits of the product) in the early iterations of the Newton-Raphson method. The reason to do so, is that the results of the early iterations are far from the quotient, and it doesn't matter to perform them inaccurately. (Refine this argument and show that if you do this thing appropriately, you can divide two ≤n-bit integers in time O(M(2n)), assuming you can multiply two ≤k-bit integers in time M(k), and M(x) is an increasing convex function).

If I see this correctly a major improvement is picking a good starting value for x. Knowing how many digits the divisor has you know where the most significant bit of the inverse has to be, as
1/x = pow(2,log2(1/x))
1/x = pow(2,-log2(x))
1/x >= pow(2,-floor(log2(x)))
floor(log2(x)) simply is the index of the most significant bit set.
As suggested in the comment by the op using a 256-bit lookup table is a going to speed up the convergence even more, because each step roughly doubles the amount of correct digits. Starting with 8 correct digits is better than starting with 1 and much better than starting with even less than that.
constexpr fixpoint_integer_inverse(const T& d) {
uint8_t lut[256] = { 255u,254u,253u,252u,251u,250u,249u,248u,247u,246u,245u,244u,243u,242u,241u,
240u,240u,239u,238u,237u,236u,235u,234u,234u,233u,232u,231u,230u,229u,229u,228u,
227u,226u,225u,225u,224u,223u,222u,222u,221u,220u,219u,219u,218u,217u,217u,216u,
215u,214u,214u,213u,212u,212u,211u,210u,210u,209u,208u,208u,207u,206u,206u,205u,
204u,204u,203u,202u,202u,201u,201u,200u,199u,199u,198u,197u,197u,196u,196u,195u,
195u,194u,193u,193u,192u,192u,191u,191u,190u,189u,189u,188u,188u,187u,187u,186u,
186u,185u,185u,184u,184u,183u,183u,182u,182u,181u,181u,180u,180u,179u,179u,178u,
178u,177u,177u,176u,176u,175u,175u,174u,174u,173u,173u,172u,172u,172u,171u,171u,
170u,170u,169u,169u,168u,168u,168u,167u,167u,166u,166u,165u,165u,165u,164u,164u,
163u,163u,163u,162u,162u,161u,161u,161u,160u,160u,159u,159u,159u,158u,158u,157u,
157u,157u,156u,156u,156u,155u,155u,154u,154u,154u,153u,153u,153u,152u,152u,152u,
151u,151u,151u,150u,150u,149u,149u,149u,148u,148u,148u,147u,147u,147u,146u,146u,
146u,145u,145u,145u,144u,144u,144u,144u,143u,143u,143u,142u,142u,142u,141u,141u,
141u,140u,140u,140u,140u,139u,139u,139u,138u,138u,138u,137u,137u,137u,137u,136u,
136u,136u,135u,135u,135u,135u,134u,134u,134u,134u,133u,133u,133u,132u,132u,132u,
132u,131u,131u,131u,131u,130u,130u,130u,130u,129u,129u,129u,129u,128u,128u,128u,
127u
};
const auto l = log2(d);
T x;
if (l<8) {
x = T(1)<<(digits(d)-1-l);
} else {
if (digits(d)>(l+8)) x = T(lut[(d>>(l-8))-256])<<(digits(d)-l-8);
else x = T(lut[(d>>(l-8))-256])>>(l+8-digits(d));
}
if (x==0) x=1;
while(true) {
const auto lm = long_mul(x,T(1)-x*d);
const T i = get<0>(lm);
if (i) x+=i;
else return x;
}
return x;
}
// calculate a * b = r0r1
template<typename T>
typename std::enable_if<std::is_unsigned<T>::value,tuple<T,T>>::type
constexpr long_mul(const T& a, const T& b){
const T N = digits<T>()/2;
const T t0 = (a>>N)*(b>>N);
const T t1 = ((a<<N)>>N)*(b>>N);
const T t2 = (a>>N)*((b<<N)>>N);
const T t3 = ((a<<N)>>N)*((b<<N)>>N);
const T t4 = t3+(t1<<N);
const T r1 = t4+(t2<<N);
const T r0 = (r1<t4)+(t4<t3)+(t1>>N)+(t2>>N)+t0;
return {r0,r1};
}

Newton-Raphson is an approximation algorithm - not appropriate for use in integer math. You will get rounding errors which will result in the kind of problems you are seeing. You could do the problem with floating point numbers and then see if you get an integer, precise to a specified number of digits (see next paragraph)
As to the second problem, pick a precision (number of decimal places) you want for accuracy and round to that precision. If you picked twenty digits of precision in the problem, you would round to 0.25. You simply need to iterate until your required digits of precision are stable. In general, representing irrational numbers on a computer often introduces imprecision.

Is div function useful (stdlib.h)? [duplicate]

This question already has answers here:
What is the purpose of the div() library function?
(6 answers)
Closed 3 years ago.
There is a function called div in C,C++ (stdlib.h)
div_t div(int numer, int denom);
typedef struct _div_t
{
int quot;
int rem;
} div_t;
But C,C++ have / and % operators.
My question is: "When there are / and % operators, Is div function useful?"

Yes, it is: it calculates the quotient and remainder in one operation.
Aside from that, the same behaviour can be achieved with /+% (and a decent optimizer will optimize them into a single div anyway).
In order to sum it up: if you care about squeezing out last bits of performance, this may be your function of choice, especially if the optimizer on your platform is not so advanced. This is often the case for embedded platforms. Otherwise, use whatever way you find more readable.

The div() function returns a structure which contains the quotient and remainder of the division of the first parameter (the numerator) by the second (the denominator). There are four variants:
div_t div(int, int)
ldiv_t ldiv(long, long)
lldiv_t lldiv(long long, long long)
imaxdiv_t imaxdiv(intmax_t, intmax_t (intmax_t represents the biggest integer type available on the system)
The div_t structure looks like this:
typedef struct
{
int quot; /* Quotient. */
int rem; /* Remainder. */
} div_t;
The implementation does simply use the / and % operators, so it's not exactly a very complicated or necessary function, but it is part of the C standard (as defined by [ISO 9899:201x][1]).
See the implementation in GNU libc:
/* Return the `div_t' representation of NUMER over DENOM. */
div_t
div (numer, denom)
int numer, denom;
{
div_t result;
result.quot = numer / denom;
result.rem = numer % denom;
/* The ANSI standard says that |QUOT| <= |NUMER / DENOM|, where
NUMER / DENOM is to be computed in infinite precision. In
other words, we should always truncate the quotient towards
zero, never -infinity. Machine division and remainer may
work either way when one or both of NUMER or DENOM is
negative. If only one is negative and QUOT has been
truncated towards -infinity, REM will have the same sign as
DENOM and the opposite sign of NUMER; if both are negative
and QUOT has been truncated towards -infinity, REM will be
positive (will have the opposite sign of NUMER). These are
considered `wrong'. If both are NUM and DENOM are positive,
RESULT will always be positive. This all boils down to: if
NUMER >= 0, but REM < 0, we got the wrong answer. In that
case, to get the right answer, add 1 to QUOT and subtract
DENOM from REM. */
if (numer >= 0 && result.rem < 0)
{
++result.quot;
result.rem -= denom;
}
return result;
}

The semantics of div() is different than the semantics of % and /, which is important in some cases.
That is why the following code is in the implementation shown in psYchotic's answer:
if (numer >= 0 && result.rem < 0)
{
++result.quot;
result.rem -= denom;
}
% may return a negative answer, whereas div() always returns a non-negative remainder.
Check the WikiPedia entry, particularly "div always rounds towards 0, unlike ordinary integer division in C, where rounding for negative numbers is implementation-dependent."

div() filled a pre-C99 need: portability
Pre C99, the rounding direction of the quotient of a / b with a negative operand was implementation dependent. With div(), the rounding direction is not optional but specified to be toward 0. div() provided uniform portable division. A secondary use was the potential efficiency when code needed to calculate both the quotient and remainder.
With C99 and later, div() and / specifying the same round direction and with better compilers optimizing nearby a/b and a%b code, the need has diminished.
This was the compelling reason for div() and it explains the absence of udiv_t udiv(unsigned numer, unsigned denom) in the C spec: The issues of implementation dependent results of a/b with negative operands are non-existent for unsigned even in pre-C99.

Probably because on many processors the div instruction produces both values and you can always count on the compiler to recognize that adjacent / and % operators on the same inputs could be coalesced into one operation.

It costs less time if you need both value.
CPU always calculate both remainder and quotient when performing division.
If use "/" once and "%" once, cpu will calculate twice both number.
(forgive my poor English, I'm not native)

C++ Should this be easier?

long-time listener, first-time caller. I am relatively new to programming and was looking back at some of the code I wrote for an old lab. Is there an easier way to tell if a double is evenly divisible by an integer?
double num (//whatever);
int divisor (//an integer);
bool bananas;
if(floor(num)!= num || static_cast<int>(num)%divisor != 0) {
bananas=false;
}
if(bananas==true)
//do stuff;
}

The question is strange, and the checks are as well. The problem is that it makes little sense to speak about divisibility of a floating point number because floating point number are represented imprecisely in binary, and divisibility is about exactitude.
I encourage you to read this article, by David Goldberg: What Every Computer Scientist Should Know About Floating Point Arithmetic. It is a bit long-winded, so you may appreciate this website, instead: The Floating-Point Guide.
The truth is that floor(num) == num is a strange piece of code.
num is a double
floor(num) returns an double, close to an int
The trouble is that this does not check what you really wanted. For example, suppose (for the sake of example) that 5 cannot be represented exactly as a double, therefore, instead of storing 5, the computer will store 4.999999999999.
double num = 5; // 4.999999999999999
double floored = floor(num); // 4.0
assert(num != floored);
In general exact comparisons are meaningless for floating point numbers, because of rounding errors.
If you insist on using floor, I suggest to use floor(num + 0.5) which is better, though slightly biased. A better rounding method is the Banker's rounding because it is unbiased, and the article references others if you wish. Note that the Banker's rounding is the baked in in round...
As for your question, first you need a double aware modulo: fmod, then you need to remember the avoid exact comparisons bit.
A first (naive) attempt:
// divisor is deemed non-zero
// epsilon is a constant
double mod = fmod(num, divisor); // divisor will be converted to a double
if (mod <= epsilon) { }
Unfortunately it fails one important test: the magnitude of mod depends on the magnitude of divisor, thus if divisor is smaller than epsilon to begin with, it will always be true.
A second attempt:
// divisor is deemed non-zero
double const epsilon = divisor / 1000.0;
double mod = fmod(num, divisor);
if (mod <= epsilon) { }
Better, but not quite there: mod and epsilon are signed! Yes, it's a bizarre modulo, th sign of mod is the sign of num
A third attempt:
// divisor is deemed non-zero
double const eps = fabs(divisor / 1000.0);
double mod = fabs(fmod(num, divisor));
if (mod <= eps) { }
Much better.
Should work fairly well too if divisor comes from an integer, as there won't be precision issues... or at least not too much.
EDIT: fourth attempt, by #ybungalobill
The previous attempt does not deal well with situations where num/divisor errors on the wrong side. Like 1.999/1.000 --> 0.999, it's nearly divisor so we should indicate equality, yet it failed.
// divisor is deemed non-zero
mod = fabs(fmod(num/divisor, 1));
if (mod <= 0.001 || fabs(1 - mod) <= 0.001) { }
Looks like a never ending task eh ?
There is still cause for troubles though.
double has a limited precision, that is a limited number of digits that is representable (16 I think ?). This precision might be insufficient to represent an integer:
Integer n = 12345678901234567890;
double d = n; // 1.234567890123457 * 10^20
This truncation means it is impossible to map it back to its original value. This should not cause any issue with double and int, for example on my platform double is 8 bytes and int is 4 bytes, so it would work, but changing double to float or int to long could violate this assumption, oh hell!
Are you sure you really need floating point, by the way ?

Based on the above comments, I believe you can do this...
double num (//whatever);
int divisor (//an integer);
if(fmod(num, divisor) == 0) {
//do stuff;
}

I haven't checked it but why not do this?
if (floor(num) == num && !(static_cast<int>(num) % divisor)) {
// do stuff...
}

How can I write a power function myself?

I was always wondering how I can make a function which calculates the power (e.g. 23) myself. In most languages these are included in the standard library, mostly as pow(double x, double y), but how can I write it myself?
I was thinking about for loops, but it think my brain got in a loop (when I wanted to do a power with a non-integer exponent, like 54.5 or negatives 2-21) and I went crazy ;)
So, how can I write a function which calculates the power of a real number? Thanks
Oh, maybe important to note: I cannot use functions which use powers (e.g. exp), which would make this ultimately useless.

Negative powers are not a problem, they're just the inverse (1/x) of the positive power.
Floating point powers are just a little bit more complicated; as you know a fractional power is equivalent to a root (e.g. x^(1/2) == sqrt(x)) and you also know that multiplying powers with the same base is equivalent to add their exponents.
With all the above, you can:
Decompose the exponent in a integer part and a rational part.
Calculate the integer power with a loop (you can optimise it decomposing in factors and reusing partial calculations).
Calculate the root with any algorithm you like (any iterative approximation like bisection or Newton method could work).
Multiply the result.
If the exponent was negative, apply the inverse.
Example:
2^(-3.5) = (2^3 * 2^(1/2)))^-1 = 1 / (2*2*2 * sqrt(2))

AB = Log-1(Log(A)*B)
Edit: yes, this definition really does provide something useful. For example, on an x86, it translates almost directly to FYL2X (Y * Log2(X)) and F2XM1 (2x-1):
fyl2x
fld st(0)
frndint
fsubr st(1),st
fxch st(1)
fchs
f2xmi
fld1
faddp st(1),st
fscale
fstp st(1)
The code ends up a little longer than you might expect, primarily because F2XM1 only works with numbers in the range -1.0..1.0. The fld st(0)/frndint/fsubr st(1),st piece subtracts off the integer part, so we're left with only the fraction. We apply F2XM1 to that, add the 1 back on, then use FSCALE to handle the integer part of the exponentiation.

Typically the implementation of the pow(double, double) function in math libraries is based on the identity:
pow(x,y) = pow(a, y * log_a(x))
Using this identity, you only need to know how to raise a single number a to an arbitrary exponent, and how to take a logarithm base a. You have effectively turned a complicated multi-variable function into a two functions of a single variable, and a multiplication, which is pretty easy to implement. The most commonly chosen values of a are e or 2 -- e because the e^x and log_e(1+x) have some very nice mathematical properties, and 2 because it has some nice properties for implementation in floating-point arithmetic.
The catch of doing it this way is that (if you want to get full accuracy) you need to compute the log_a(x) term (and its product with y) to higher accuracy than the floating-point representation of x and y. For example, if x and y are doubles, and you want to get a high accuracy result, you'll need to come up with some way to store intermediate results (and do arithmetic) in a higher-precision format. The Intel x87 format is a common choice, as are 64-bit integers (though if you really want a top-quality implementation, you'll need to do a couple of 96-bit integer computations, which are a little bit painful in some languages). It's much easier to deal with this if you implement powf(float,float), because then you can just use double for intermediate computations. I would recommend starting with that if you want to use this approach.
The algorithm that I outlined is not the only possible way to compute pow. It is merely the most suitable for delivering a high-speed result that satisfies a fixed a priori accuracy bound. It is less suitable in some other contexts, and is certainly much harder to implement than the repeated-square[root]-ing algorithm that some others have suggested.
If you want to try the repeated square[root] algorithm, begin by writing an unsigned integer power function that uses repeated squaring only. Once you have a good grasp on the algorithm for that reduced case, you will find it fairly straightforward to extend it to handle fractional exponents.

There are two distinct cases to deal with: Integer exponents and fractional exponents.
For integer exponents, you can use exponentiation by squaring.
def pow(base, exponent):
if exponent == 0:
return 1
elif exponent < 0:
return 1 / pow(base, -exponent)
elif exponent % 2 == 0:
half_pow = pow(base, exponent // 2)
return half_pow * half_pow
else:
return base * pow(base, exponent - 1)
The second "elif" is what distinguishes this from the naïve pow function. It allows the function to make O(log n) recursive calls instead of O(n).
For fractional exponents, you can use the identity a^b = C^(b*log_C(a)). It's convenient to take C=2, so a^b = 2^(b * log2(a)). This reduces the problem to writing functions for 2^x and log2(x).
The reason it's convenient to take C=2 is that floating-point numbers are stored in base-2 floating point. log2(a * 2^b) = log2(a) + b. This makes it easier to write your log2 function: You don't need to have it be accurate for every positive number, just on the interval [1, 2). Similarly, to calculate 2^x, you can multiply 2^(integer part of x) * 2^(fractional part of x). The first part is trivial to store in a floating point number, for the second part, you just need a 2^x function over the interval [0, 1).
The hard part is finding a good approximation of 2^x and log2(x). A simple approach is to use Taylor series.

Per definition:
a^b = exp(b ln(a))
where exp(x) = 1 + x + x^2/2 + x^3/3! + x^4/4! + x^5/5! + ...
where n! = 1 * 2 * ... * n.
In practice, you could store an array of the first 10 values of 1/n!, and then approximate
exp(x) = 1 + x + x^2/2 + x^3/3! + ... + x^10/10!
because 10! is a huge number, so 1/10! is very small (2.7557319224⋅10^-7).

Wolfram functions gives a wide variety of formulae for calculating powers. Some of them would be very straightforward to implement.

For positive integer powers, look at exponentiation by squaring and addition-chain exponentiation.

Using three self implemented functions iPow(x, n), Ln(x) and Exp(x), I'm able to compute fPow(x, a), x and a being doubles. Neither of the functions below use library functions, but just iteration.
Some explanation about functions implemented:
(1) iPow(x, n): x is double, n is int. This is a simple iteration, as n is an integer.
(2) Ln(x): This function uses the Taylor Series iteration. The series used in iteration is Σ (from int i = 0 to n) {(1 / (2 * i + 1)) * ((x - 1) / (x + 1)) ^ (2 * n + 1)}. The symbol ^ denotes the power function Pow(x, n) implemented in the 1st function, which uses simple iteration.
(3) Exp(x): This function, again, uses the Taylor Series iteration. The series used in iteration is Σ (from int i = 0 to n) {x^i / i!}. Here, the ^ denotes the power function, but it is not computed by calling the 1st Pow(x, n) function; instead it is implemented within the 3rd function, concurrently with the factorial, using d *= x / i. I felt I had to use this trick, because in this function, iteration takes some more steps relative to the other functions and the factorial (i!) overflows most of the time. In order to make sure the iteration does not overflow, the power function in this part is iterated concurrently with the factorial. This way, I overcame the overflow.
(4) fPow(x, a): x and a are both doubles. This function does nothing but just call the other three functions implemented above. The main idea in this function depends on some calculus: fPow(x, a) = Exp(a * Ln(x)). And now, I have all the functions iPow, Ln and Exp with iteration already.
n.b. I used a constant MAX_DELTA_DOUBLE in order to decide in which step to stop the iteration. I've set it to 1.0E-15, which seems reasonable for doubles. So, the iteration stops if (delta < MAX_DELTA_DOUBLE) If you need some more precision, you can use long double and decrease the constant value for MAX_DELTA_DOUBLE, to 1.0E-18 for example (1.0E-18 would be the minimum).
Here is the code, which works for me.
#define MAX_DELTA_DOUBLE 1.0E-15
#define EULERS_NUMBER 2.718281828459045
double MathAbs_Double (double x) {
return ((x >= 0) ? x : -x);
}
int MathAbs_Int (int x) {
return ((x >= 0) ? x : -x);
}
double MathPow_Double_Int(double x, int n) {
double ret;
if ((x == 1.0) || (n == 1)) {
ret = x;
} else if (n < 0) {
ret = 1.0 / MathPow_Double_Int(x, -n);
} else {
ret = 1.0;
while (n--) {
ret *= x;
}
}
return (ret);
}
double MathLn_Double(double x) {
double ret = 0.0, d;
if (x > 0) {
int n = 0;
do {
int a = 2 * n + 1;
d = (1.0 / a) * MathPow_Double_Int((x - 1) / (x + 1), a);
ret += d;
n++;
} while (MathAbs_Double(d) > MAX_DELTA_DOUBLE);
} else {
printf("\nerror: x < 0 in ln(x)\n");
exit(-1);
}
return (ret * 2);
}
double MathExp_Double(double x) {
double ret;
if (x == 1.0) {
ret = EULERS_NUMBER;
} else if (x < 0) {
ret = 1.0 / MathExp_Double(-x);
} else {
int n = 2;
double d;
ret = 1.0 + x;
do {
d = x;
for (int i = 2; i <= n; i++) {
d *= x / i;
}
ret += d;
n++;
} while (d > MAX_DELTA_DOUBLE);
}
return (ret);
}
double MathPow_Double_Double(double x, double a) {
double ret;
if ((x == 1.0) || (a == 1.0)) {
ret = x;
} else if (a < 0) {
ret = 1.0 / MathPow_Double_Double(x, -a);
} else {
ret = MathExp_Double(a * MathLn_Double(x));
}
return (ret);
}

It's an interesting exercise. Here's some suggestions, which you should try in this order:
Use a loop.
Use recursion (not better, but interesting none the less)
Optimize your recursion vastly by using divide-and-conquer
techniques
Use logarithms

You can found the pow function like this:
static double pows (double p_nombre, double p_puissance)
{
double nombre = p_nombre;
double i=0;
for(i=0; i < (p_puissance-1);i++){
nombre = nombre * p_nombre;
}
return (nombre);
}
You can found the floor function like this:
static double floors(double p_nomber)
{
double x = p_nomber;
long partent = (long) x;
if (x<0)
{
return (partent-1);
}
else
{
return (partent);
}
}
Best regards

A better algorithm to efficiently calculate positive integer powers is repeatedly square the base, while keeping track of the extra remainder multiplicands. Here is a sample solution in Python that should be relatively easy to understand and translate into your preferred language:
def power(base, exponent):
remaining_multiplicand = 1
result = base
while exponent > 1:
remainder = exponent % 2
if remainder > 0:
remaining_multiplicand = remaining_multiplicand * result
exponent = (exponent - remainder) / 2
result = result * result
return result * remaining_multiplicand
To make it handle negative exponents, all you have to do is calculate the positive version and divide 1 by the result, so that should be a simple modification to the above code. Fractional exponents are considerably more difficult, since it means essentially calculating an nth-root of the base, where n = 1/abs(exponent % 1) and multiplying the result by the result of the integer portion power calculation:
power(base, exponent - (exponent % 1))
You can calculate roots to a desired level of accuracy using Newton's method. Check out wikipedia article on the algorithm.

I am using fixed point long arithmetics and my pow is log2/exp2 based. Numbers consist of:
int sig = { -1; +1 } signum
DWORD a[A+B] number
A is number of DWORDs for integer part of number
B is number of DWORDs for fractional part
My simplified solution is this:
//---------------------------------------------------------------------------
longnum exp2 (const longnum &x)
{
int i,j;
longnum c,d;
c.one();
if (x.iszero()) return c;
i=x.bits()-1;
for(d=2,j=_longnum_bits_b;j<=i;j++,d*=d)
if (x.bitget(j))
c*=d;
for(i=0,j=_longnum_bits_b-1;i<_longnum_bits_b;j--,i++)
if (x.bitget(j))
c*=_longnum_log2[i];
if (x.sig<0) {d.one(); c=d/c;}
return c;
}
//---------------------------------------------------------------------------
longnum log2 (const longnum &x)
{
int i,j;
longnum c,d,dd,e,xx;
c.zero(); d.one(); e.zero(); xx=x;
if (xx.iszero()) return c; //**** error: log2(0) = infinity
if (xx.sig<0) return c; //**** error: log2(negative x) ... no result possible
if (d.geq(x,d)==0) {xx=d/xx; xx.sig=-1;}
i=xx.bits()-1;
e.bitset(i); i-=_longnum_bits_b;
for (;i>0;i--,e>>=1) // integer part
{
dd=d*e;
j=dd.geq(dd,xx);
if (j==1) continue; // dd> xx
c+=i; d=dd;
if (j==2) break; // dd==xx
}
for (i=0;i<_longnum_bits_b;i++) // fractional part
{
dd=d*_longnum_log2[i];
j=dd.geq(dd,xx);
if (j==1) continue; // dd> xx
c.bitset(_longnum_bits_b-i-1); d=dd;
if (j==2) break; // dd==xx
}
c.sig=xx.sig;
c.iszero();
return c;
}
//---------------------------------------------------------------------------
longnum pow (const longnum &x,const longnum &y)
{
//x^y = exp2(y*log2(x))
int ssig=+1; longnum c; c=x;
if (y.iszero()) {c.one(); return c;} // ?^0=1
if (c.iszero()) return c; // 0^?=0
if (c.sig<0)
{
c.overflow(); c.sig=+1;
if (y.isreal()) {c.zero(); return c;} //**** error: negative x ^ noninteger y
if (y.bitget(_longnum_bits_b)) ssig=-1;
}
c=exp2(log2(c)*y); c.sig=ssig; c.iszero();
return c;
}
//---------------------------------------------------------------------------
where:
_longnum_bits_a = A*32
_longnum_bits_b = B*32
_longnum_log2[i] = 2 ^ (1/(2^i)) ... precomputed sqrt table
_longnum_log2[0]=sqrt(2)
_longnum_log2[1]=sqrt[tab[0])
_longnum_log2[i]=sqrt(tab[i-1])
longnum::zero() sets *this=0
longnum::one() sets *this=+1
bool longnum::iszero() returns (*this==0)
bool longnum::isnonzero() returns (*this!=0)
bool longnum::isreal() returns (true if fractional part !=0)
bool longnum::isinteger() returns (true if fractional part ==0)
int longnum::bits() return num of used bits in number counted from LSB
longnum::bitget()/bitset()/bitres()/bitxor() are bit access
longnum.overflow() rounds number if there was a overflow X.FFFFFFFFFF...FFFFFFFFF??h -> (X+1).0000000000000...000000000h
int longnum::geq(x,y) is comparition |x|,|y| returns 0,1,2 for (<,>,==)
All you need to understand this code is that numbers in binary form consists of sum of powers of 2, when you need to compute 2^num then it can be rewritten as this
2^(b(-n)*2^(-n) + ... + b(+m)*2^(+m))
where n are fractional bits and m are integer bits. multiplication/division by 2 in binary form is simple bit shifting so if you put it all together you get code for exp2 similar to my. log2 is based on binaru search...changing the result bits from MSB to LSB until it matches searched value (very similar algorithm as for fast sqrt computation). hope this helps clarify things...

A lot of approaches are given in other answers. Here is something that I thought may be useful in case of integral powers.
In the case of integer power x of nx, the straightforward approach would take x-1 multiplications. In order to optimize this, we can use dynamic programming and reuse an earlier multiplication result to avoid all x multiplications. For example, in 59, we can, say, make batches of 3, i.e. calculate 53 once, get 125 and then cube 125 using the same logic, taking only 4 multiplcations in the process, instead of 8 multiplications with the straightforward way.
The question is what is the ideal size of the batch b so that the number of multiplications is minimum. So let's write the equation for this. If f(x,b) is the function representing the number of multiplications entailed in calculating nx using the above method, then
Explanation: A product of batch of p numbers will take p-1 multiplications. If we divide x multiplications into b batches, there would be (x/b)-1 multiplications required inside each batch, and b-1 multiplications required for all b batches.
Now we can calculate the first derivative of this function with respect to b and equate it to 0 to get the b for the least number of multiplications.
Now put back this value of b into the function f(x,b) to get the least number of multiplications:
For all positive x, this value is lesser than the multiplications by the straightforward way.

maybe you can use taylor series expansion. the Taylor series of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor series are equal near this point. Taylor's series are named after Brook Taylor who introduced them in 1715.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Dividing two integers and rounding up the result, without using floating point - c++

Assuming that both myIntNumber and myOtherInt are positive, you could do: int myValue = (myIntNumber + myOtherInt - 1) / myOtherInt;

Maybe it is just easier to do a: int result = dividend / divisor; if(dividend % divisor != 0) result++;

Just use int ceil_of_division = ((dividend-1)/divisor)+1; For example: for (int i=0;i<20;i++) std::cout << i << "/8 = " << ((i-1)/8)+1 << std::endl;

A small hack is to do: int divideUp(int a, int b) { result = (a-1)/b + 1; } // Proof: a = bN + k (always) if k == 0, then (a-1) == bN - 1 (a-1)/b == N - 1 (a-1)/b + 1 == N ---> Good ! if k > 0, then (a-1) == b*N + l (a-1)/b == N (a-1)/b + 1 == N+1 ---> Good !

Related

Understanding this way of computing ( x^e mod n)

Newton-Raphson Division With Big Integers

Is div function useful (stdlib.h)? [duplicate]

C++ Should this be easier?

How can I write a power function myself?

Categories

Resources

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Dividing two integers and rounding up the result, without using floating point - c++

Assuming that both myIntNumber and myOtherInt are positive, you could do: int myValue = (myIntNumber + myOtherInt - 1) / myOtherInt;

Maybe it is just easier to do a: int result = dividend / divisor; if(dividend % divisor != 0) result++;

Just use int ceil_of_division = ((dividend-1)/divisor)+1; For example: for (int i=0;i<20;i++) std::cout << i << "/8 = " << ((i-1)/8)+1 << std::endl;

A small hack is to do: int divideUp(int a, int b) { result = (a-1)/b + 1; } // Proof: a = b*N + k (always) if k == 0, then (a-1) == b*N - 1 (a-1)/b == N - 1 (a-1)/b + 1 == N ---> Good ! if k > 0, then (a-1) == b*N + l (a-1)/b == N (a-1)/b + 1 == N+1 ---> Good !

Related

Understanding this way of computing ( x^e mod n)

Newton-Raphson Division With Big Integers

Is div function useful (stdlib.h)? [duplicate]

C++ Should this be easier?

How can I write a power function myself?

Categories

Resources

A small hack is to do: int divideUp(int a, int b) { result = (a-1)/b + 1; } // Proof: a = bN + k (always) if k == 0, then (a-1) == bN - 1 (a-1)/b == N - 1 (a-1)/b + 1 == N ---> Good ! if k > 0, then (a-1) == b*N + l (a-1)/b == N (a-1)/b + 1 == N+1 ---> Good !