Why is X % 0 an invalid expression?
I always thought X % 0 should equal X. Since you can't divide by zero, shouldn't the answer naturally be the remainder, X (everything left over)?
The C++ Standard(2003) says in §5.6/4,
[...] If the second operand of / or % is zero the behavior is undefined; [...]
That is, the following expressions invoke undefined behavior (UB):
X / 0; //UB
X % 0; //UB
Note also that -5 % 2 is NOT equal to -(5 % 2) (as Petar seems to suggest in his comment to his answer). It's implementation-defined. The spec says (§5.6/4),
[...] If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.
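For what it's worth, the wording quoted above is from C++03. As far as I know, C++11 and later require integer division to truncate toward zero, which makes the sign of the remainder follow the dividend. A minimal sketch under that assumption; floor_mod is a hypothetical helper name, not a standard function:

```cpp
// Assumes C++11 or later, where a / b truncates toward zero and
// (a / b) * b + a % b == a holds for b != 0.
// floor_mod is a hypothetical helper, not part of the standard library.
int floor_mod(int a, int b) {
    // Precondition: b != 0 (a % 0 is undefined behavior).
    int r = a % b;  // under C++11 rules the sign follows the dividend a
    if (r != 0 && ((r < 0) != (b < 0))) {
        r += b;     // move the remainder to the divisor's sign
    }
    return r;
}
```

With this helper, floor_mod(-5, 2) gives 1, while the built-in -5 % 2 gives -1 on a C++11 compiler.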
This answer is not for the mathematician. This answer attempts to give motivation (at the cost of mathematical precision).
Mathematicians: See here.
Programmers: Remember that division by 0 is undefined. Therefore, mod, which relies on division, is also undefined.
This represents division for positive X and D; it's made up of the integral part and fractional part:
(X / D) = integer + fraction
= floor(X / D) + (X % D) / D
Rearranging, you get:
(X % D) = D * (X / D) - D * floor(X / D)
Substituting 0 for D:
(X % 0) = 0 * (X / 0) - 0 * floor(X / 0)
Since division by 0 is undefined:
(X % 0) = 0 * undefined - 0 * floor(undefined)
= undefined - undefined
= undefined
X % D is by definition a number 0 <= R < D, such that there exists Q so that
X = D*Q + R
So if D = 0, no such number can exist (because 0 <= R < 0)
I think because to get the remainder of X % 0 you need to first calculate X / 0 which yields infinity, and trying to calculate the remainder of infinity is not really possible.
However, the best solution in line with your thinking would be to do something like this
REMAIN = Y ? X % Y : X
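The guard above can be wrapped in a small function. This is only a sketch: checked_mod and mod_or_x are hypothetical names, and the extra INT_MIN check is my own addition, since INT_MIN % -1 is also undefined on common platforms.

```cpp
#include <climits>

// Hypothetical helper: writes x % y to out and returns true, or returns
// false where the built-in operator would have undefined behavior.
bool checked_mod(int x, int y, int& out) {
    if (y == 0) return false;                   // X % 0
    if (x == INT_MIN && y == -1) return false;  // quotient would overflow
    out = x % y;
    return true;
}

// The convention suggested above: treat X % 0 as X.
int mod_or_x(int x, int y) {
    if (y == 0) return x;
    if (y == -1) return 0;  // sidesteps the INT_MIN % -1 overflow case
    return x % y;
}
```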
Another way that might be conceptually easy to understand the issue:
Ignoring for the moment the issue of argument sign, a % b could easily be re-written as a - ((a / b) * b). The expression a / b is undefined if b is zero, so in that case the overall expression must be too.
In the end, modulus is effectively a divisive operation, so if a / b is undefined, it's not unreasonable to expect a % b to be as well.
X % Y gives a result in the integer range [0, Y). X % 0 would have to give a result greater than or equal to zero, and less than zero.
You can evade the "division by 0" case of (A % B) for floating-point types. mod(a, b) with b = 0.0 is undefined, or defined differently between implementations. To avoid logic errors (hard crashes) in favor of arithmetic errors, work
by computing mod([a*b],[b])==b*(a-floor(a))
INSTEAD OF
computing mod([a],[b])
where [a*b] == your x-axis, over time
[b] == the maximum of the seesaw curve (that will never be reached) == the first derivative of the seesaw function
https://www.shadertoy.com/view/MslfW8
I suppose because to get the remainder of X % 0 you need to first calculate X / 0 which yields infinity, and trying to calculate the remainder of infinity is not really possible.
However, the best solution in line with your thinking would be to do something like this,
ans = Y ? X % Y : X
Also, the C++ docs say that X % 0 or X / 0 results in an undefined value.
How computers divide:
Start with the dividend and subtract the divisor until the result is less than the divisor. The number of times you subtracted is the result and what you have left is the remainder. For example, to divide 10 and 3:
10 - 3 = 7
7 - 3 = 4
4 - 3 = 1
So
10 / 3 = 3
10 % 3 = 1
To divide 1 and 0:
1 / 0
1 - 0 = 1
1 - 0 = 1
1 - 0 = 1
...
So
1 / 0 = Infinity (technically even infinity is too small, but it's easy to classify it as that)
1 % 0 = NaN
If nothing stopped it, this procedure would never terminate. So the hardware checks for a zero divisor up front: floating-point division returns NaN or Infinity, while integer division typically raises an exception instead (depending on your platform).
This will never end so the remainder is undefined (which is NaN for computers).
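Real hardware does not divide by repeated subtraction, but the idea is easy to sketch in code. The function name divide_by_subtraction is hypothetical; the zero check is there precisely because the loop would otherwise never finish.

```cpp
// Hypothetical illustration of division by repeated subtraction.
// Returns false instead of looping forever when the divisor is zero.
bool divide_by_subtraction(unsigned dividend, unsigned divisor,
                           unsigned& quotient, unsigned& remainder) {
    if (divisor == 0) return false;  // subtracting 0 never makes progress
    quotient = 0;
    remainder = dividend;
    while (remainder >= divisor) {
        remainder -= divisor;        // one subtraction per unit of quotient
        ++quotient;
    }
    return true;
}
```

For 10 and 3 this performs three subtractions and leaves a quotient of 3 and a remainder of 1; for a divisor of 0 it reports failure rather than a value.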
What are the fastest divisibility tests? Say, given a little-endian architecture and a 32-bit signed integer: how to calculate very fast that a number is divisible by 2,3,4,5,... up to 16?
WARNING: the given code is an EXAMPLE only. Every line is independent! The obvious solution using the modulo operation is slow on many processors that lack DIV hardware (like many ARMs). Some compilers also cannot make such optimizations (say, if the divisor is a function argument or depends on something).
Divisible_by_1 = do();
Divisible_by_2 = if (!(number & 1)) do();
Divisible_by_3 = ?
Divisible_by_4 = ?
Divisible_by_5 = ?
Divisible_by_6 = ?
Divisible_by_7 = ?
Divisible_by_8 = ?
Divisible_by_9 = ?
Divisible_by_10 = ?
Divisible_by_11 = ?
Divisible_by_12 = ?
Divisible_by_13 = ?
Divisible_by_14 = ?
Divisible_by_15 = ?
Divisible_by_16 = if (!(number & 0x0000000F)) do();
and special cases:
Divisible_by_2k = if (!(number & (tk-1))) do(); //tk=2**k=(2*2*2*...) k times
In every case (including divisible by 2):
if (number % n == 0) do();
Anding with a mask of low order bits is just obfuscation, and with a modern compiler will not be any faster than writing the code in a readable fashion.
If you have to test all of the cases, you might improve performance by nesting some of the cases inside the if for another: there's no point in testing for divisibility by 4 if divisibility by 2 has already failed, for example.
It is not a bad idea AT ALL to figure out alternatives to division instructions (which includes modulo on x86/x64) because they are very slow. Slower (or even much slower) than most people realize. Those suggesting "% n" where n is a variable are giving foolish advice because it will invariably lead to the use of the division instruction. On the other hand "% c" (where c is a constant) will allow the compiler to determine the best algorithm available in its repertoire. Sometimes it will be the division instruction but a lot of the time it won't.
In this document Torbjörn Granlund shows that the ratio of clock cycles required for unsigned 32-bit mults:divs is 4:26 (6.5x) on Sandybridge and 3:45 (15x) on K10. for 64-bit the respective ratios are 4:92 (23x) and 5:77 (14.4x).
The "L" columns denote latency. "T" columns denote throughput. This has to do with the processor's ability to handle multiple instructions in parallel. Sandybridge can issue one 32-bit multiplication every other cycle or one 64-bit every cycle. For K10 the corresponding throughput is reversed. For divisions the K10 needs to complete the entire sequence before it may begin another. I suspect it is the same for Sandybridge.
Using the K10 as an example it means that during the cycles required for a 32-bit division (45) the same number (45) of multiplications can be issued and the next-to-last and last one of these will complete one and two clock cycles after the division has completed. A LOT of work can be performed in 45 multiplications.
It is also interesting to note that divs have become less efficient with the evolution from K8-K9 to K10: from 39 to 45 and 71 to 77 clock cycles for 32- and 64-bit.
Granlund's page at gmplib.org and at the Royal Institute of Technology in Stockholm contain more goodies, some of which have been incorporated into the gcc compiler.
As @James mentioned, let the compiler simplify it for you. If n is a constant, any decent compiler is able to recognize the pattern and change it to a more efficient equivalent.
For example, the code
#include <stdio.h>
int main() {
size_t x;
scanf("%zu", &x);
__asm__ volatile ("nop;nop;nop;nop;nop;");
const char* volatile foo = (x%3 == 0) ? "yes" : "no";
__asm__ volatile ("nop;nop;nop;nop;nop;");
printf("%s\n", foo);
return 0;
}
compiled with g++-4.5 -O3, the relevant part of x%3 == 0 will become
mov rcx,QWORD PTR [rbp-0x8] # rbp-0x8 = &x
mov rdx,0xaaaaaaaaaaaaaaab
mov rax,rcx
mul rdx
lea rax,"yes"
shr rdx,1
lea rdx,[rdx+rdx*2]
cmp rcx,rdx
lea rdx,"no"
cmovne rax,rdx
mov QWORD PTR [rbp-0x10],rax
which, translated back to C code, means
(hi64bit(x * 0xaaaaaaaaaaaaaaab) / 2) * 3 == x ? "yes" : "no"
// equivalent to: x % 3 == 0 ? "yes" : "no"
no division involved here. (Note that 0xaaaaaaaaaaaaaaab == 0x20000000000000001L/3)
Edit:
The magic constant 0xaaaaaaaaaaaaaaab can be computed in http://www.hackersdelight.org/magic.htm
For divisors of the form 2^n - 1, check http://graphics.stanford.edu/~seander/bithacks.html#ModulusDivision
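The same multiply-and-shift idea can be written by hand for 32-bit values. This is a sketch of the general technique rather than the compiler's exact output: div3_mulshift and divisible_by_3_mulshift are hypothetical names, and the constant 0xAAAAAAAB is (2^33 + 1) / 3, the 32-bit counterpart of the 64-bit constant in the listing above.

```cpp
#include <cstdint>

// Quotient x / 3 for a 32-bit unsigned x without a division instruction:
// multiply by (2^33 + 1) / 3 == 0xAAAAAAAB in 64 bits, then shift right by 33.
std::uint32_t div3_mulshift(std::uint32_t x) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x) * 0xAAAAAAABull) >> 33);
}

// The remainder is zero exactly when quotient * 3 reproduces x.
bool divisible_by_3_mulshift(std::uint32_t x) {
    return div3_mulshift(x) * 3u == x;
}
```

The rounding error introduced by the constant stays below 1/3 for every 32-bit x, which is why the shifted product equals the exact quotient.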
A bit tongue in cheek, but assuming you get the rest of the answers:
Divisible_by_6 = Divisible_by_3 && Divisible_by_2;
Divisible_by_10 = Divisible_by_5 && Divisible_by_2;
Divisible_by_12 = Divisible_by_4 && Divisible_by_3;
Divisible_by_14 = Divisible_by_7 && Divisible_by_2;
Divisible_by_15 = Divisible_by_5 && Divisible_by_3;
Assume number is unsigned (32-bits). Then the following are very fast ways to compute divisibility up to 16. (I haven't measured but the assembly code indicates so.)
bool divisible_by_2 = number % 2 == 0;
bool divisible_by_3 = number * 2863311531u <= 1431655765u;
bool divisible_by_4 = number % 4 == 0;
bool divisible_by_5 = number * 3435973837u <= 858993459u;
bool divisible_by_6 = divisible_by_2 && divisible_by_3;
bool divisible_by_7 = number * 3067833783u <= 613566756u;
bool divisible_by_8 = number % 8 == 0;
bool divisible_by_9 = number * 954437177u <= 477218588u;
bool divisible_by_10 = divisible_by_2 && divisible_by_5;
bool divisible_by_11 = number * 3123612579u <= 390451572u;
bool divisible_by_12 = divisible_by_3 && divisible_by_4;
bool divisible_by_13 = number * 3303820997u <= 330382099u;
bool divisible_by_14 = divisible_by_2 && divisible_by_7;
bool divisible_by_15 = number * 4008636143u <= 286331153u;
bool divisible_by_16 = number % 16 == 0;
Regarding divisibility by d the following rules hold:
When d is a power of 2:
As pointed out by James Kanze, you can use is_divisible_by_d = (number % d == 0). Compilers are clever enough to implement this as (number & (d - 1)) == 0 which is very efficient but obfuscated.
However, when d is not a power of 2 it looks like the obfuscations shown above are more efficient than what current compilers do. (More on that later).
When d is odd:
The technique takes the form is_divisible_by_d = number * a <= b where a and b are cleverly obtained constants. Notice that all we need is 1 multiplication and 1 comparison:
When d is even but not a power of 2:
Then, write d = p * q where p is a power of 2 and q is odd and use the "tongue in cheek" suggested by unpythonic, that is, is_divisible_by_d = is_divisible_by_p && is_divisible_by_q. Again, only 1 multiplication (in the calculation of is_divisible_by_q) is performed.
Many compilers (I've tested clang 5.0.0, gcc 7.3, icc 18 and msvc 19 using godbolt) replace number % d == 0 by (number / d) * d == number. They use a clever technique (see references in Olof Forshell's answer) to replace the division by a multiplication and a bit shift. They end up doing 2 multiplications. In contrast the techniques above perform only 1 multiplication.
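The constants in the odd-divisor test can be derived rather than memorized. To my understanding, a is the multiplicative inverse of d modulo 2^32 and b is floor((2^32 - 1) / d); the sketch below computes both under that assumption. The function names are hypothetical, and the Newton iteration for the inverse is my choice of method, not something stated above.

```cpp
#include <cstdint>

// Multiplicative inverse of an odd d modulo 2^32 via Newton's iteration.
// d * d == 1 (mod 8) for odd d, so the starting guess has 3 correct bits
// and every step doubles that number.
std::uint32_t inverse_mod_2_32(std::uint32_t d) {
    std::uint32_t x = d;
    for (int i = 0; i < 5; ++i) {
        x *= 2u - d * x;  // arithmetic wraps modulo 2^32
    }
    return x;
}

// Divisibility of n by an odd d using one multiplication and one comparison.
bool is_divisible_odd(std::uint32_t n, std::uint32_t d) {
    const std::uint32_t a = inverse_mod_2_32(d);
    const std::uint32_t b = UINT32_MAX / d;
    return n * a <= b;
}
```

For d = 3 this yields a = 2863311531 and b = 1431655765, matching the divisible_by_3 line in the list above. In practice a and b would be precomputed constants per divisor.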
Update 01-Oct-2018
Looks like the algorithm above is coming to GCC soon (already in trunk):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82853
The GCC's implementation seems even more efficient. Indeed, the implementation above has three parts: 1) divisibility by the divisor's even part; 2) divisibility by the divisor's odd part; 3) && to connect the results of the two previous steps. By using an assembler instruction which is not efficiently available in standard C++ (ror), GCC wraps up the three parts into a single one which is very similar to that of divisibility by the odd part. Great stuff! Having this implementation available, it's better (for both clarity and performance) to fall back to % all times.
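The rotate trick described above can be sketched in portable C++ as follows. This is my reading of the idea, not GCC's actual code: write d = q * 2^k with q odd, multiply by the inverse of q, rotate right by k, and compare against floor((2^32 - 1) / d). All names are hypothetical.

```cpp
#include <cstdint>

// Inverse of an odd q modulo 2^32 (Newton's iteration, wraps modulo 2^32).
std::uint32_t odd_inverse_mod_2_32(std::uint32_t q) {
    std::uint32_t x = q;
    for (int i = 0; i < 5; ++i) x *= 2u - q * x;
    return x;
}

// Portable rotate right; the mask keeps the shift count valid when k == 0.
std::uint32_t rotate_right(std::uint32_t v, unsigned k) {
    return (v >> k) | (v << ((32u - k) & 31u));
}

// Precondition: d != 0.
bool is_divisible_ror(std::uint32_t n, std::uint32_t d) {
    unsigned k = 0;
    std::uint32_t q = d;
    while ((q & 1u) == 0) {  // split d into q * 2^k with q odd
        q >>= 1;
        ++k;
    }
    // Any nonzero low bit of the product is rotated into the high bits,
    // which pushes the value above the bound.
    return rotate_right(n * odd_inverse_mod_2_32(q), k) <= UINT32_MAX / d;
}
```

The loop that splits d is only for illustration; for a constant divisor, k, the inverse, and the bound would all be compile-time constants.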
Update 05-May-2020
My articles on the subject have been published:
Quick Modular Calculations (Part 1), Overload Journal 154, December 2019, pages 11-15.
Quick Modular Calculations (Part 2), Overload Journal 155, February 2020, pages 14-17.
Quick Modular Calculations (Part 3), Overload Journal 156, April 2020, pages 10-13.
First of all, I remind you that a number in the form bn...b2b1b0 in binary has value:
number = bn*2^n+...+b2*4+b1*2+b0
Now, when you say number%3, you have:
number%3 =3= bn*(2^n % 3)+...+b2*1+b1*2+b0
(I used =3= to indicate congruence modulo 3). Note also that b1*2 =3= -b1*1
Now I will write all the 16 divisions using + and - and possibly multiplication (note that multiplication could be written as shift or sum of same value shifted to different locations. For example 5*x means x+(x<<2) in which you compute x once only)
Let's call the number n and let's say Divisible_by_i is a boolean value. As an intermediate value, imagine Congruence_by_i is a value congruent to n modulo i.
Also, lets say n0 means bit zero of n, n1 means bit 1 etc, that is
ni = (n >> i) & 1;
Congruence_by_1 = 0
Congruence_by_2 = n&0x1
Congruence_by_3 = n0-n1+n2-n3+n4-n5+n6-n7+n8-n9+n10-n11+n12-n13+n14-n15+n16-n17+n18-n19+n20-n21+n22-n23+n24-n25+n26-n27+n28-n29+n30-n31
Congruence_by_4 = n&0x3
Congruence_by_5 = n0+2*n1-n2-2*n3+n4+2*n5-n6-2*n7+n8+2*n9-n10-2*n11+n12+2*n13-n14-2*n15+n16+2*n17-n18-2*n19+n20+2*n21-n22-2*n23+n24+2*n25-n26-2*n27+n28+2*n29-n30-2*n31
Congruence_by_7 = n0+2*n1+4*n2+n3+2*n4+4*n5+n6+2*n7+4*n8+n9+2*n10+4*n11+n12+2*n13+4*n14+n15+2*n16+4*n17+n18+2*n19+4*n20+n21+2*n22+4*n23+n24+2*n25+4*n26+n27+2*n28+4*n29+n30+2*n31
Congruence_by_8 = n&0x7
Congruence_by_9 = n0+2*n1+4*n2-n3-2*n4-4*n5+n6+2*n7+4*n8-n9-2*n10-4*n11+n12+2*n13+4*n14-n15-2*n16-4*n17+n18+2*n19+4*n20-n21-2*n22-4*n23+n24+2*n25+4*n26-n27-2*n28-4*n29+n30+2*n31
Congruence_by_11 = n0+2*n1+4*n2+8*n3+5*n4-n5-2*n6-4*n7-8*n8-5*n9+n10+2*n11+4*n12+8*n13+5*n14-n15-2*n16-4*n17-8*n18-5*n19+n20+2*n21+4*n22+8*n23+5*n24-n25-2*n26-4*n27-8*n28-5*n29+n30+2*n31
Congruence_by_13 = n0+2*n1+4*n2+8*n3+3*n4+6*n5-n6-2*n7-4*n8-8*n9-3*n10-6*n11+n12+2*n13+4*n14+8*n15+3*n16+6*n17-n18-2*n19-4*n20-8*n21-3*n22-6*n23+n24+2*n25+4*n26+8*n27+3*n28+6*n29-n30-2*n31
Congruence_by_16 = n&0xF
Or when factorized:
Congruence_by_1 = 0
Congruence_by_2 = n&0x1
Congruence_by_3 = (n0+n2+n4+n6+n8+n10+n12+n14+n16+n18+n20+n22+n24+n26+n28+n30)-(n1+n3+n5+n7+n9+n11+n13+n15+n17+n19+n21+n23+n25+n27+n29+n31)
Congruence_by_4 = n&0x3
Congruence_by_5 = n0+n4+n8+n12+n16+n20+n24+n28-(n2+n6+n10+n14+n18+n22+n26+n30)+2*(n1+n5+n9+n13+n17+n21+n25+n29-(n3+n7+n11+n15+n19+n23+n27+n31))
Congruence_by_7 = n0+n3+n6+n9+n12+n15+n18+n21+n24+n27+n30+2*(n1+n4+n7+n10+n13+n16+n19+n22+n25+n28+n31)+4*(n2+n5+n8+n11+n14+n17+n20+n23+n26+n29)
Congruence_by_8 = n&0x7
Congruence_by_9 = n0+n6+n12+n18+n24+n30-(n3+n9+n15+n21+n27)+2*(n1+n7+n13+n19+n25+n31-(n4+n10+n16+n22+n28))+4*(n2+n8+n14+n20+n26-(n5+n11+n17+n23+n29))
// and so on
If these values end up being negative, add i to them until they become non-negative.
Now what you should do is recursively feed these values through the same process we just did until Congruence_by_i becomes less than i (and obviously >= 0). This is similar to what we do when we want to find the remainder of a number by 3 or 9, remember? Sum up the digits; if the result has more than one digit, sum up the digits of the result again until you get only one digit.
Now for i = 1, 2, 3, 4, 5, 7, 8, 9, 11, 13, 16:
Divisible_by_i = (Congruence_by_i == 0);
And for the rest:
Divisible_by_6 = Divisible_by_3 && Divisible_by_2;
Divisible_by_10 = Divisible_by_5 && Divisible_by_2;
Divisible_by_12 = Divisible_by_4 && Divisible_by_3;
Divisible_by_14 = Divisible_by_7 && Divisible_by_2;
Divisible_by_15 = Divisible_by_5 && Divisible_by_3;
Edit: Note that some of the additions could be avoided from the very beginning. For example n0+2*n1+4*n2 is the same as n&0x7, similarly n3+2*n4+4*n5 is (n>>3)&0x7 and thus with each formula, you don't have to get each bit individually, I wrote it like that for the sake of clarity and similarity in operation. To optimize each of the formulas, you should work on it yourself; group operands and factorize operation.
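As one concrete instance of the scheme above, here is a sketch for i = 3 that groups the even and odd bits with masks instead of extracting each bit. The function names are hypothetical, and std::bitset::count is used as a portable population count.

```cpp
#include <bitset>
#include <cstdint>

// Sum of even-position bits minus sum of odd-position bits.
// Since 2^even == 1 and 2^odd == -1 (mod 3), the result is congruent
// to n modulo 3 and lies in [-16, 16].
int congruence_by_3(std::uint32_t n) {
    const int even = static_cast<int>(std::bitset<32>(n & 0x55555555u).count());
    const int odd = static_cast<int>(std::bitset<32>(n & 0xAAAAAAAAu).count());
    return even - odd;
}

bool divisible_by_3_bits(std::uint32_t n) {
    int c = congruence_by_3(n);
    if (c < 0) c = -c;  // the sign does not affect divisibility
    // Second round: c is at most 16, so one more pass leaves a value in [0, 3].
    c = congruence_by_3(static_cast<std::uint32_t>(c));
    if (c < 0) c = -c;
    return c == 0 || c == 3;
}
```

Taking the absolute value replaces the "add i until non-negative" step; that shortcut is fine for a divisibility test, though not if the actual remainder is wanted.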
The LCM of these numbers is 720720. It's quite small, so you can perform a single modulus operation and use the remainder as the index into a precomputed LUT.
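A minimal sketch of that lookup table, assuming one 16-bit mask per remainder where bit d-1 records divisibility by d. The names and the mask layout are my own choices; the table takes about 1.4 MB and is built once on first use.

```cpp
#include <cstdint>
#include <vector>

const std::uint32_t kLcm = 720720;  // lcm(1, 2, ..., 16)

// Bit (d - 1) of table[r] is set when r is divisible by d.
std::vector<std::uint16_t> build_divisibility_table() {
    std::vector<std::uint16_t> table(kLcm, 0);
    for (std::uint32_t d = 1; d <= 16; ++d) {
        for (std::uint32_t r = 0; r < kLcm; r += d) {
            table[r] |= static_cast<std::uint16_t>(1u << (d - 1));
        }
    }
    return table;
}

// One modulus operation, then a lookup. Every d in 1..16 divides kLcm,
// so n and n % kLcm are divisible by exactly the same values of d.
std::uint16_t divisibility_mask(std::uint32_t n) {
    static const std::vector<std::uint16_t> table = build_divisibility_table();
    return table[n % kLcm];
}

bool divisible_by(std::uint32_t n, unsigned d) {  // expects 1 <= d <= 16
    return ((divisibility_mask(n) >> (d - 1)) & 1u) != 0;
}
```

Whether this beats sixteen separate tests depends on cache behavior, since the table is much larger than a typical L1 cache.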
You should just use (i % N) == 0 as your test.
My compiler (a fairly old version of gcc) generated good code for all the cases I tried.
Where bit tests were appropriate it did that. Where N was a constant it didn't generate the obvious "divide" for any case, it always used some "trick".
Just let the compiler generate the code for you, it will almost certainly know more about the architecture of the machine than you do :) And these are easy optimisations where you are unlikely to think up something better than the compiler does.
It's an interesting question though. I can't list the tricks used by the compiler for each constant as I have to compile on a different computer.. But I'll update this reply later on if nobody beats me to it :)
This probably won't help you in code, but there's a neat trick which can help do this in your head in some cases:
For divide by 3: For a number represented in decimal, you can sum all the digits, and check if the sum is divisible by 3.
Example: 12345 => 1+2+3+4+5 = 15 => 1+5 = 6, which is divisible by 3 (3 x 4115 = 12345).
More interestingly the same technique works for all factors of X-1, where X is the base in which the number is represented. So for decimal number, you can check divide by 3 or 9. For hex, you can check divide by 3,5 or 15. And for octal numbers, you can check divide by 7.
In a previous question, I showed a fast algorithm to check in base N for divisors that are factors of N-1. Base transformations between different powers of 2 are trivial; that's just bit grouping.
Therefore, checking for 3 is easy in base 4; checking for 5 is easy in base 16, and checking for 7 (and 9) is easy in base 64.
Non-prime divisors are trivial, so only 11 and 13 are hard cases. For 11, you could use base 1024, but at that point it's not really efficient for small integers.
A method that can help modulo reduction of all integer values uses bit-slicing and popcount.
mod3 = pop(x & 0x55555555) + (pop(x & 0xaaaaaaaa) << 1); // <- one term is shared!
mod5 = pop(x & 0x99999999) + (pop(x & 0xaaaaaaaa) << 1) + (pop(x & 0x44444444) << 2);
mod7 = pop(x & 0x49249249) + (pop(x & 0x92492492) << 1) + (pop(x & 0x24924924) << 2);
modB = pop(x & 0x5d1745d1) + (pop(x & 0xba2e8ba2) << 1) +
(pop(x & 0x294a5294) << 2) + (pop(x & 0x0681a068) << 3);
modD = pop(x & 0x91b91b91) + (pop(x & 0xb2cb2cb2) << 1) +
(pop(x & 0x64a64a64) << 2) + (pop(x & 0xc85c85c8) << 3);
The maximum values for these variables are 48, 80, 73, 168 and 203, which all fit into 8-bit variables. The second round can be carried out in parallel (or some LUT method can be applied):
mod3 mod3 mod5 mod5 mod5 mod7 mod7 mod7 modB modB modB modB modD modD modD modD
mask 0x55 0xaa 0x99 0xaa 0x44 0x49 0x92 0x24 0xd1 0xa2 0x94 0x68 0x91 0xb2 0x64 0xc8
shift *1 *2 *1 *2 *4 *1 *2 *4 *1 *2 *4 *8 *1 *2 *4 *8
sum <-------> <------------> <-----------> <-----------------> <----------------->
You can replace division by a non-power-of-two constant by a multiplication, essentially multiplying by the reciprocal of your divisor. The details to get the exact result by this method are complicated.
Hacker's Delight discusses this at length in chapter 10 (unfortunately not available online).
From the quotient you can get the modulus by another multiplication and a subtraction.
One thing to consider: since you only care about divisibility up to 16, you really only need to check divisibility by the primes up to 16. These are 2, 3, 5, 7, 11, and 13.
Divide your number by each of the primes, keeping track with a boolean (such as div2 = true). The numbers two and three are special cases. If div3 is true, try dividing by 3 again, setting div9. Two and its powers are very simple (note: '&' is one of the fastest things a processor can do):
if n & 1 == 0:
    div2 = true
if n & 3 == 0:
    div4 = true
if n & 7 == 0:
    div8 = true
if n & 15 == 0:
    div16 = true
You now have the booleans div2, div3, div4, div5, div7, div8, div9, div11, div13, and div16. All
other numbers are combinations; for instance div6 is the same as (div2 && div3)
So, you only need to do either 5 or 6 actual divisions (6 only if your number is divisible by 3).
For myself, i would probably use bits in a single register for my booleans; for instance
bit_0 means div2. I can then use masks:
if (flags & (div2+div3)) == (div2 + div3): do_6()
note that div2+div3 can be a precomputed constant. If div2 is bit0, and div3 is bit1,
then div2+div3 == 3. This makes the above 'if' optimize to:
if (flags & 3) == 3: do_6()
So now... mod without a divide:
def mod(n,m):
    i = 0
    while m <= n:   # <= so that exact multiples (e.g. n == m) reduce to 0
        m <<= 1
        i += 1
    while i > 0:
        m >>= 1
        if m <= n: n -= m
        i -= 1
    return n
div3 = mod(n,3) == 0
...
btw: the worst case for the above code is 31 times through either loop for a 32-bit number
FYI: Just looked at Msalter's post, above. His technique can be used instead of mod(...) for some of the primes.
Fast tests for divisibility depend heavily on the base in which the number is represented. When the base is 2, I think you can only do "fast tests" for divisibility by powers of 2. A binary number is divisible by 2^n iff the last n binary digits of that number are 0. For other tests I don't think you can generally find anything faster than %.
A bit of evil, obfuscated bit-twiddling can get you divisibility by 15.
For a 32-bit unsigned number:
unsigned int mod_15ish(unsigned int x) {
    // returns a number between 0 and 21 that is either x % 15
    // or 15 + (x % 15), and returns 0 only for x == 0
    x = (x & 0xF0F0F0F) + ((x >> 4) & 0xF0F0F0F);
    x = (x & 0xFF00FF) + ((x >> 8) & 0xFF00FF);
    x = (x & 0xFFFF) + ((x >> 16) & 0xFFFF);
    // *1
    x = (x & 0xF) + ((x >> 4) & 0xF);
    return x;
}
bool Divisible_by_15(unsigned int x) {
    return ((x == 0) || (mod_15ish(x) == 15));
}
You can build similar divisibility routines for 3 and 5 based on mod_15ish.
If you have 64-bit unsigned ints to deal with, extend each constant above the *1 line in the obvious way, and add a line above the *1 line to do a right shift by 32 bits with a mask of 0xFFFFFFFF. (The last two lines can stay the same) mod_15ish then obeys the same basic contract, but the return value is now between 0 and 31. (so what's maintained is that x % 15 == mod_15ish(x) % 15)
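A sketch of that 64-bit extension, following the recipe above. The names mod_15ish64 and divisible_by_15_64 are hypothetical, and the final test accepts 0, 15 or 30 so that it holds for any return value in the stated range.

```cpp
#include <cstdint>

// Folds a 64-bit value into a small number congruent to x modulo 15.
// Each step adds neighbouring groups; this preserves the value modulo 15
// because 16, 2^8, 2^16 and 2^32 are all congruent to 1 modulo 15.
std::uint64_t mod_15ish64(std::uint64_t x) {
    x = (x & 0x0F0F0F0F0F0F0F0Full) + ((x >> 4) & 0x0F0F0F0F0F0F0F0Full);
    x = (x & 0x00FF00FF00FF00FFull) + ((x >> 8) & 0x00FF00FF00FF00FFull);
    x = (x & 0x0000FFFF0000FFFFull) + ((x >> 16) & 0x0000FFFF0000FFFFull);
    x = (x & 0xFFFFFFFFull) + ((x >> 32) & 0xFFFFFFFFull);  // the added line
    // *1
    x = (x & 0xF) + ((x >> 4) & 0xF);
    return x;
}

bool divisible_by_15_64(std::uint64_t x) {
    const std::uint64_t r = mod_15ish64(x);
    return r == 0 || r == 15 || r == 30;
}
```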
Here are some tips I haven't seen anyone else suggest yet:
One idea is to use a switch statement, or precompute some array. Then, any decent optimizer can simply index each case directly. For example:
// tests for (2,3,4,5,6,7)
switch (n % 8)
{
case 0: break;
case 1: break;
case 2: do(2); break;
case 3: do(3); break;
case 4: do(2); do(4); break;
case 5: do(5); break;
case 6: do(2); do(3); do(4); break;
case 7: do(7); break;
}
Your application is a bit ambiguous, but you may only need to check prime numbers less than n=16. This is because all numbers are factors of the current or previous prime numbers. So for n=16, you might be able to get away with only checking 2, 3, 5, 7, 11, 13 somehow. Just a thought.
I was asked to generate a random number between a and b, inclusive, using random(0,1). random(0,1) generates a uniform random number between 0 and 1.
I answered
(a+(((1+random(0,1))*b))%(b-a))
My interviewer was not satisfied with my usage of b in this piece of the expression:
(((1+random(0,1))*b))
Then I tried changing my answer to:
int*z=(int*)malloc(sizeof(int));
(a+(((1+random(0,1))*(*z)))%(b-a));
Later the question changed to generate random(1,7) from random(1,5). I responded with:
A = rand(1,5)%3
B = (rand(1,5)+1)%3
C = (rand(1,5)+2)%3
rand(1,7) = rand(1,5)+ (A+B+C)%3
Were my answers correct?
I think you were confused between a random integer generator and a random floating-point generator. In C++, rand() generates a random integer between 0 and RAND_MAX (at least 32767). Thus, to generate a random number from 1 to 10, we write rand() % 10 + 1. Likewise, to generate a random number from integer a to integer b, we write rand() % (b - a + 1) + a.
The interviewer told you that you had a random generator from 0 to 1. It means floating-point number generator.
How to get the answer mathematically:
Shift the question to a simple form such that the lower bound is 0.
Scale the range by multiplication
Re-shift to the required range.
For example: to generate R such that
a <= R <= b.
Apply rule 1, we get a-a <= R - a <= b-a
0 <= R - a <= b - a.
Think R - a as R1. How to generate R1 such that R1 has range from 0 to (b-a)?
R1 = rand(0, 1) * (b-a) // by apply rule 2.
Now substitute R1 by R - a
R - a = rand(0,1) * (b-a) ==> R = a + rand(0,1) * (b-a)
==== 2nd question - without explanation ====
We have 1 <= R1 <= 5
==> 0 <= R1 - 1 <= 4
==> 0 <= (R1 - 1)/4 <= 1
==> 0 <= 6 * (R1 - 1)/4 <= 6
==> 1 <= 1 + 6 * (R1 - 1)/4 <= 7
Thus, Rand(1,7) = 1 + 6 * (rand(1,5) - 1) / 4
random(a,b) from random(0,1):
random(0,1)*(b-a)+a
random(c,d) from random(a,b):
(random(a,b)-a)/(b-a)*(d-c)+c
or, simplified for your case (a=1,b=5,c=1,d=7):
random(1,5) * 1.5 - 0.5
(note: I assume we're talking about float values and that rounding errors are negligible)
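The scaling formulas above can be sketched in C++ as follows. The names are hypothetical, random01 is a stand-in for the interviewer's random(0,1) built on the standard <random> header, and the integer variant (with its clamp) is my own addition for the case where whole numbers from a to b inclusive are wanted.

```cpp
#include <cmath>
#include <random>

// Maps u in [0, 1] linearly onto [a, b]: random(0,1)*(b-a)+a.
double scale_unit(double u, double a, double b) {
    return a + u * (b - a);
}

// Maps u in [0, 1) onto the integers a..b inclusive, each with equal width.
int scale_unit_to_int(double u, int a, int b) {
    int r = a + static_cast<int>(std::floor(u * (b - a + 1)));
    if (r > b) r = b;  // guards against u being exactly 1.0
    return r;
}

// Stand-in for random(0,1); returns a value in [0, 1).
double random01(std::mt19937& gen) {
    return std::uniform_real_distribution<double>(0.0, 1.0)(gen);
}
```

A call such as scale_unit(random01(gen), 1.0, 7.0) then plays the role of random(1,7) for floating-point values.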
random(a,b) from random(c,d) = a + (b-a)*((random(c,d) - c)/(d-c))
No?
[random(0,1)*(b-a)] + a, I think, would give random numbers between a and b.
([random(1,5)-1]/4)*6 + 1 should give random numbers in the range (1,7).
I am not sure whether the above will destroy the uniform distribution.
Were my answers correct?
I think there are some problems.
First off, I'm assuming that random() returns a floating point value - otherwise to generate any useful distribution of a larger range of numbers using random(0,1) would require repeated calls to generate a pool of bits to work with.
I'm also going to assume C/C++ is the intended platform, since the question is tagged as such.
Given these assumptions, one problem with your answers is that C/C++ do not allow the use of the % operator on floating point types.
But even if we imagine that the % operator was replaced with a function that performed a modulo operation with floating point arguments in a reasonable way, there are still some problems. In your initial answer, if b (or the uninitialized *z allocated in your second attempt - I'm assuming this is a kind of bizarre way to get an arbitrary value, or is something else intended?) is zero (say the range given for a and b is (-5, 0)), then your result will be decidedly non-uniform. The result would always be b.
Finally, I'm certainly no statistician, but in your final answer (to generate random(1,7) from random(1,5)), I'm pretty sure that A+B+C would be non-uniform and would therefore introduce a bias in the result.
I think that there is a nicer answer to this. There is one value (probability -> zero) that this overflows and thus the modulus is there.
Take a random number x in the interval [0,1].
Increment your upper_bound which could be a parameter by one.
Calculate (int(random() / (1.0 / upper_bound)) % upper_bound) + 1 + lower_bound .
This ought to return a number in your desired interval.
given random(0,5) you can generate random(0,7) in the following way
A = random(0,5)*random(0,5)
now the range of A is 0-25
If we simply take A modulo 7, we get random numbers, but they won't be uniformly distributed: values of A from 22-25 map to 1-4 after the modulo operation, so taking modulo 7 over the range (0,25) biases the output towards 1-4. This is because 7 does not evenly divide 25: the largest multiple of 7 less than or equal to 25 is 7*3=21, and it is the numbers in the incomplete range 22-25 that cause the bias.
The easiest way to fix this problem is to discard those numbers (22-25) and keep trying again until a number in the suitable range comes up.
Obviously, this is true when we assume that we want random integers.
However to get random float numbers we need to modify the range accordingly as described in above posts.
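The discard-and-retry idea can be sketched for the integer case of random(1,7) from random(1,5). Note one deliberate change from the answer above: the two draws are combined as 5*(r1-1)+r2 rather than multiplied, because a product of two uniform draws is not itself uniform. The names are hypothetical and rand5 is a stand-in built on <random>.

```cpp
#include <random>

// Stand-in for random(1,5): uniform integer in 1..5.
int rand5(std::mt19937& gen) {
    return std::uniform_int_distribution<int>(1, 5)(gen);
}

// Uniform integer in 1..7 built only from rand5, using rejection.
int rand7(std::mt19937& gen) {
    for (;;) {
        const int hi = rand5(gen);
        const int lo = rand5(gen);
        const int v = 5 * (hi - 1) + lo;  // uniform in 1..25
        if (v <= 21) {                    // 21 == 3 * 7; discard 22..25
            return 1 + (v - 1) % 7;
        }
    }
}
```

Each pass accepts with probability 21/25, so the expected number of rand5 calls per result is a little under 2.4.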
I am trying to implement a simple decimation algorithm in c++. I have two arrays, say p & q, where the subscripts are related to each other by the following relation:
p[k] = q[0.5*k]. This means that the following sequence should hold valid:
p[0] = q[0]
p[1] = 0
p[2] = q[1]
p[3] = 0
p[4] = q[2]
and so on...
Please note that p[k] takes on a value only and only when the result of (0.5*k) contains no decimal places (or has 0 in decimal) and does not use any rounding off etc.
My question is: Is there a way to distinguish between an integer (a number with no decimal places or only 0 in decimal, say 2.0) and a number with decimal places in C++, provided both are cast to double?
eg.) 2.0 is an integer cast to double. 2.1 is a number with decimal places.
eg. 2) * 0.9*2 should put 0 into array p while 0.9*10 should put q[9] into array p.*
If I use the statement, (int) (0.5*k), then I end up with an integer in every case, irrespective of the value of k.
Edit: The 0.5 in the above case is only illustrative. It could be any number, say 2, 2.5, 0.9, 0.95 etc.)
Any help is most welcome,
Thanks,
Sriram.
Assuming k is of an integer type, you could use if (k % 2 == 0) ... to check if k is divisible by two:
if (k % 2 == 0)
p[k] = q[k / 2];
else
p[k] = 0;
This can also be expressed using the ternary operator:
p[k] = (k % 2 == 0) ? q[k / 2] : 0;
Presuming that the coef can be anything else,
p[(int)floor(coef*k)] = (fabs(coef*k-floor(coef*k))<1E-6)?q[k]:0;
The short syntax for what you want to do could be this:
p[k] = k % 2 ? 0 : q[k/2];
Is there a way to distinguish between a whole number and an integer in C++?
Define whole number, and define integer in this context. I'm confused!
Are you talking about the difference as explained here?
If you want to detect whether a number is integer or not, then probably this may help:
#include <cmath>
bool IsInteger(double d)
{
    double intpart;
    // modf splits d into integral and fractional parts; d is whole
    // when the fractional part is exactly zero
    return std::modf(d, &intpart) == 0.0;
}
k % 2 is in a couple of answers in this thread.
However, this is not useful in answering the OP's question. Note the edit:
"Edit: The 0.5 in the above case is only illustrative. It could be any number, say 2, 2.5, 0.9, 0.95 etc.)"
k % 2 only works because the value chosen was 0.5. It won't hold true for any other values.
Therefore, unless I'm missing something entirely, the simplest approach I can think of is the following:
Subtract the floor of the number from the number itself. If the result is > 0, it is not an integer.
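A sketch of that check applied to the original problem, for an arbitrary coefficient. Two assumptions of mine: the test measures the distance to the nearest integer with a small tolerance instead of comparing against floor exactly (to absorb rounding error in products like 0.9*10), and out-of-range indices are left as 0. The names is_whole and expand are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// True when x is within eps of a whole number.
bool is_whole(double x, double eps = 1e-9) {
    return std::fabs(x - std::round(x)) < eps;
}

// p[k] = q[coef * k] when coef * k is a whole, in-range index; otherwise 0.
std::vector<double> expand(const std::vector<double>& q, double coef,
                           std::size_t n) {
    std::vector<double> p(n, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        const double idx = coef * static_cast<double>(k);
        if (idx >= 0.0 && is_whole(idx)) {
            const std::size_t i = static_cast<std::size_t>(std::llround(idx));
            if (i < q.size()) p[k] = q[i];
        }
    }
    return p;
}
```

With coef = 0.5 and q = {10, 20, 30}, the first five outputs are 10, 0, 20, 0, 30, which matches the sequence in the question.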
Unless you have expressions that result in irrational numbers, you could use Boost.Rational to represent your indices.
@Aix's suggestion of k%2 looks like it'd combine nicely with the ?: operator:
p[k] = (k%2) ? 0 : q[k/2];