What's the difference between these two formulas?
mid = low + (high - low) / 2;
mid = (high + low) / 2;
In the 2nd version, if high + low is greater than the maximum value of an int (assuming both are ints), the addition can overflow, invoking undefined behavior. The 1st version avoids this particular bug.
There are still issues with the 1st version, e.g. if low is a very large negative number, the difference can still overflow.
From C++20, you should use std::midpoint for this, which handles a whole bunch of corner cases and does the right thing for all of them.
This seemingly simple function is actually surprisingly difficult to implement; in fact, there's an hour-long talk given by Marshall Clow at CppCon 2019 that covers the implementation of just this function.
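For reference, a minimal usage sketch (C++20, header <numeric>); std::midpoint rounds toward its first argument and cannot overflow:
#include <numeric>
#include <climits>
#include <cassert>

int main() {
    int low = INT_MAX - 1, high = INT_MAX;
    // (low + high) / 2 would overflow here; std::midpoint does not.
    assert(std::midpoint(low, high) == INT_MAX - 1);
    return 0;
}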
The first one is superior (although still not perfect, see Binary Search: how to determine half of the array):
It works in cases where addition is not defined for high and low but adding an interval to low is defined. Pointers are one such example; an object of a date type can be another (see the sketch below).
high + low can overflow the type. For a signed integral type, the behaviour is undefined.
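As a small illustration of the pointer case from the first point (a sketch of my own, not part of the answer): two pointers cannot be added, but adding half their distance to the lower one is well defined as long as both point into the same array.
template <typename T>
T* midpoint_ptr(T* low, T* high) {  // assumes low <= high and both point into the same array
    return low + (high - low) / 2;  // high - low is a ptrdiff_t; low + n is well defined
}                                   // note: low + high would not even compile for pointers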
Both suffer from potential overflow. Signed integer overflow is undefined behavior (UB).
With unsigned math (often used in array indexing), when low <= high, low + (high - low) / 2 does not overflow, whereas (high + low) / 2 potentially does.
The same holds with signed math when low <= high and 0 <= low.
To avoid any overflow with signed math (or unsigned math with low > high) and still use only int/unsigned math, I thought the below would work.
mid = high/2 + low/2 + (high%2 + low%2)/2;
Yet that can fail when the sign of high/2 + low/2 differs from the sign of (high%2 + low%2).
A more robust and tested version is below. Perhaps I'll simplify later.
#include <limits.h>
#include <stdint.h> /* for intmax_t */
#include <stdio.h>
int midpoint(int a, int b) {
int avg = a/2 + b/2;
int small_sum = a%2 + b%2;
avg += small_sum/2;
small_sum %= 2;
if (avg < 0) {
if (small_sum > 0) avg++;
} else if (avg > 0) {
if (small_sum < 0) avg--;
}
return avg;
}
int midpoint_test(int a, int b) {
intmax_t lavg = ((intmax_t)a + (intmax_t)b)/2;
int avg = midpoint(a,b);
printf("a:%12d b:%12d avg_wide_math:%12jd avg_midpoint:%12d\n", a,b,lavg,avg);
return lavg == avg;
}
int main(void) {
int a[] = {INT_MIN, INT_MIN+1, -100, -99, -2, -1, 0, 1, 2, 99, 100, INT_MAX-1, INT_MAX};
int n = sizeof a/ sizeof a[0];
for (int i=0; i<n; i++) {
for (int j=0; j<n; j++) {
if (midpoint_test(a[i], a[j]) == 0) {
puts("Oops");
return 1;
}
}
}
puts("Success");
return 0;
}
The two formulae are different:
both may overflow depending on the values of low and high.
even when there is no overflow, they do not necessarily produce the same result: the first computes the midpoint and the second computes the average of 2 numbers.
For the rest of the discussion, we shall assume that low, mid and high have the same type. We are looking for a safe way to find the midpoint or average between low and high, which is always in the range of the type.
The first formula, mid = low + (high - low) / 2; rounds toward low if the type is signed and may overflow if the type is signed and high and low are too far apart.
The second formula, mid = (high + low) / 2; rounds toward 0, but may overflow for large values of high and/or low for both signed and unsigned types.
In your particular application, computing the index of the middle element of a sorted array to perform binary search, the index values low and high are non-negative and low <= high. With this constraint, both formulas compute the same result, but the second can overflow whereas the first cannot.
Hence for your case, you should use mid = low + (high - low) / 2; as a safe replacement for mid = (high + low) / 2;.
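For completeness, a minimal binary search sketch using the safe form (names are illustrative):
// Returns the index of toFind in sortedArray[0..len-1], or -1 if not found.
int binarySearch(const int sortedArray[], int len, int toFind) {
    int low = 0, high = len - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // safe: 0 <= low <= high, so no overflow
        if (sortedArray[mid] < toFind)
            low = mid + 1;
        else if (sortedArray[mid] > toFind)
            high = mid - 1;
        else
            return mid;
    }
    return -1;
}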
In the general case, computing the average (second formula) without overflow is a tricky problem. Below is a set of solutions for the average formula, along with a test program inspired from chux' answer. They can be adapted for any signed integer type:
#include <limits.h>
#include <stdio.h>
#include <stdint.h>
int average_chqrlie(int a, int b) {
if (a <= b) {
if (a >= 0)
return a + ((b - a) >> 1);
if (b < 0)
return b - ((b - a) >> 1);
} else {
if (b >= 0)
return b + ((a - b) >> 1);
if (a < 0)
return a - ((a - b) >> 1);
}
return (a + b) / 2;
}
int average_chqrlie2(int a, int b) {
if (a > b) {
int tmp = a;
a = b;
b = tmp;
}
if (a >= 0)
return a + ((b - a) >> 1);
if (b < 0)
return b - ((b - a) >> 1);
return (a + b) / 2;
}
int average_chqrlie3(int a, int b) {
int half, mid;
if (a < b) {
half = (int)(((unsigned)b - (unsigned)a) / 2);
mid = a + half;
if (mid < 0)
mid = b - half;
} else {
half = (int)(((unsigned)a - (unsigned)b) / 2);
mid = b + half;
if (mid < 0)
mid = a - half;
}
return mid;
}
int average_chux(int a, int b) {
int avg = a / 2 + b / 2;
int small_sum = a % 2 + b % 2;
avg += small_sum / 2;
small_sum %= 2;
if (avg < 0) {
if (small_sum > 0)
avg++;
} else if (avg > 0) {
if (small_sum < 0)
avg--;
}
return avg;
}
int run_tests(const char *name, int (*fun)(int a, int b)) {
int array[] = { INT_MIN, INT_MIN+1, -100, -99, -2, -1, 0, 1, 2, 99, 100, INT_MAX-1, INT_MAX };
int n = sizeof(array) / sizeof(array[0]);
int status = 0;
printf("Testing %s:", name);
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
int a = array[i], b = array[j];
intmax_t lavg = ((intmax_t)a + (intmax_t)b) / 2; // assuming sizeof(intmax_t) > sizeof(int)
int avg = fun(a, b);
if (lavg != avg) {
printf("\na:%12d b:%12d average_wide:%12jd average:%12d", a, b, lavg, avg);
status = 1;
}
}
}
puts(status ? "\nFailed" : " Success");
return status;
}
int main() {
run_tests("average_chqrlie", average_chqrlie);
run_tests("average_chqrlie2", average_chqrlie2);
run_tests("average_chqrlie3", average_chqrlie3);
run_tests("average_chux", average_chux);
return 0;
}
The first one will not overflow for large values of low/high, unlike the second one. It is always preferable to use mid = low + (high - low) / 2.
I was trying to build a 17-bit adder; when overflow occurs the result should wrap around, just like an int32 does.
e.g. in int32 addition, if a = 2^31 - 1 and
int res = a + 1;
then res = -2^31.
Here is the code I tried; it is not working. Is there a better way? Do I need to convert decimal to binary and then perform the 17-bit operation?
int addOvf(int32_t result, int32_t a, int32_t b)
{
int max = (-(0x01<<16))
int min = ((0x01<<16) -1)
int range_17bit = (0x01<<17);
if (a >= 0 && b >= 0 && (a > max - b)) {
printf("...OVERFLOW.........a=%0d b=%0d",a,b);
}
else if (a < 0 && b < 0 && (a < min - b)) {
printf("...UNDERFLOW.........a=%0d b=%0d",a,b);
}
result = a+b;
if(result<min) {
while(result<min){ result=result + range_17bit; }
}
else if(result>min){
while(result>max){ result=result - range_17bit; }
}
return result;
}
int main()
{
int32_t res,x,y;
x=-65536;
y=-1;
res =addOvf(res,x,y);
printf("Value of x=%0d y=%0d res=%0d",x,y,res);
return 0;
}
You have your constants for max/min int17 reversed and off by one. They should be
max_int17 = (1 << 16) - 1 = 65535
and
min_int17 = -(1 << 16) = -65536.
Then I believe that max_int_n + m == min_int_n + (m-1) and min_int_n - m == max_int_n - (m-1), where n is the bit count and m is some integer in [min_int_n, ... ,max_int_n]. So putting that all together the function to treat two int32's as though they are int17's and add them would be like
int32_t add_as_int17(int32_t a, int32_t b) {
static const int32_t max_int17 = (1 << 16) - 1;
static const int32_t min_int17 = -(1 << 16);
auto sum = a + b;
if (sum < min_int17) {
auto m = min_int17 - sum;
return max_int17 - (m - 1);
} else if (sum > max_int17) {
auto m = sum - max_int17;
return min_int17 + (m - 1);
}
return sum;
}
There is probably some more clever way to do that but I believe the above is correct, assuming I understand what you want.
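One such shorter way (a sketch of my own, not part of the answer above) is to reduce the sum modulo 2^17 and re-center it, using the identity r = ((x + 2^16) mod 2^17) - 2^16:
#include <cstdint>

int32_t add_as_int17_masked(int32_t a, int32_t b) {
    int32_t sum = a + b;   // both inputs are 17-bit values, so the int32 sum cannot overflow
    // Masking a negative value assumes two's complement representation.
    return ((sum + (1 << 16)) & 0x1FFFF) - (1 << 16);
}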
The code below calculates 2^n where 1 <= n <= 10^5. To calculate such large numbers I have used the concept of modular exponentiation. The code gives the correct output, but due to the large number of test cases it exceeds the time limit. I cannot find a way to make the solution consume less time, since the "algo" function is called as many times as there are test cases. So I want to put the logic used in the "algo" function into main() so that it runs in under 1 second and still gives the correct output. Here "t" represents the number of test cases and its value is 1 <= t <= 10^5.
Any suggestions from your side would be of great help!!
#include<iostream>
#include<math.h>
using namespace std;
int algo(int x, int y){
long m = 1000000007;
if(y == 0){
return 1;
}
int k = algo(x,y/2);
if (y % 2 == 1){
return ((((1ll * k * k) % m) * x) % m);
} else { // y is even; a plain else guarantees every path returns a value
return ((1ll * k * k) % m);
}
}
int main(void)
{
int n, t, k;
cin>>t; //t = number of test cases
for ( k = 0; k < t; k++)
{
cin >> n; //power of 2
cout<<"the value after algo is: "<<algo(2,n)<<endl;
}
return 0;
}
You can make use of the binary representation of the exponent (square-and-multiply, shifting the exponent right one bit at a time) to find powers of two modulo p:
#include <iostream>
using namespace std;
int main()
{
unsigned long long u = 1, w = 2, n = 10, p = 1000000007, r;
//n -> power of two
while (n != 0)
{
if ((n & 0x1) != 0)
u = (u * w) % p;
if ((n >>= 1) != 0)
w = (w * w) % p;
}
r = (unsigned long)u;
cout << r;
return 0;
}
This is the function that I often use to calculate
Any integer X raised to power Y modulo M
C++ Function to calculate (X^Y) mod M
int power(int x, int y, const int mod = 1e9+7)
{
int result = 1;
x = x % mod;
if (x == 0)
return 0;
while (y > 0)
{
if (y & 1)
result = (int)((1LL * result * x) % mod); // 64-bit intermediate avoids int overflow
y = y >> 1; // y = y / 2
x = (int)((1LL * x * x) % mod); // 64-bit intermediate avoids int overflow
}
return result;
}
Remove the mod if you don't want it.
The time complexity of this function is O(log2(Y)).
There can be a case of overflow, so use int, long, or long long as per your need.
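To tie this back to the original question, a hypothetical driver (assuming the power() function above is in the same file); with t up to 1e5, unsynced streams and '\n' instead of endl also matter:
#include <iostream>

int power(int x, int y, int mod);   // defined above

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);              // avoid flushing cout before every cin read
    int t;
    std::cin >> t;
    while (t--) {
        int n;
        std::cin >> n;
        std::cout << power(2, n, 1000000007) << '\n';  // '\n' does not flush, unlike endl
    }
    return 0;
}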
Well, your variables won't sustain the boundary test cases: 1 <= n <= 10^5 allows numbers like 2^10000, or the one below. RIP algorithms
19950631168807583848837421626835850838234968318861924548520089498529438830221946631919961684036194597899331129423209124271556491349413781117593785932096323957855730046793794526765246551266059895520550086918193311542508608460618104685509074866089624888090489894838009253941633257850621568309473902556912388065225096643874441046759871626985453222868538161694315775629640762836880760732228535091641476183956381458969463899410840960536267821064621427333394036525565649530603142680234969400335934316651459297773279665775606172582031407994198179607378245683762280037302885487251900834464581454650557929601414833921615734588139257095379769119277800826957735674444123062018757836325502728323789270710373802866393031428133241401624195671690574061419654342324638801248856147305207431992259611796250130992860241708340807605932320161268492288496255841312844061536738951487114256315111089745514203313820202931640957596464756010405845841566072044962867016515061920631004186422275908670900574606417856951911456055068251250406007519842261898059237118054444788072906395242548339221982707404473162376760846613033778706039803413197133493654622700563169937455508241780972810983291314403571877524768509857276937926433221599399876886660808368837838027643282775172273657572744784112294389733810861607423253291974813120197604178281965697475898164531258434135959862784130128185406283476649088690521047580882615823961985770122407044330583075869039319604603404973156583208672105913300903752823415539745394397715257455290510212310947321610753474825740775273986348298498340756937955646638621874569499279016572103701364433135817214311791398222983845847334440270964182851005072927748364550578634501100852987812389473928699540834346158807043959118985815145779177143619698728131459483783202081474982171858011389071228250905826817436220577475921417653715687725614904582904992461028630081535583308130101987675856234343538955409175623400844887526162643568648833519463720377293240094456246923254350400678027273837755376406726898636241037491410966718557050759098100246789880178271925953381282421954028302759408448955014676668389697996886241636313376393903373455801407636741877711055384225739499110186468219696581651485130494222369947714763069155468217682876200362777257723781365331611196811280792669481887201298643660768551639860534602297871557517947385246369446923087894265948217008051120322365496288169035739121368338393591756418733850510970271613915439590991598154654417336311656936031122249937969999226781732358023111862644575299135758175008199839236284615249881088960232244362173771618086357015468484058622329792853875623486556440536962622018963571028812361567512543338303270029097668650568557157505516727518899194129711337690149916181315171544007728650573189557450920330185304847113818315407324053319038462084036421763703911550639789000742853672196280903477974533320468368795868580237952218629120080742819551317948157624448298518461509704888027274721574688131594750409732115080498190455803416826949787141316063210686391511681774304792596709376
Fear not, my friend, someone did try to solve the problem: https://www.quora.com/What-is-2-raised-to-the-power-of-50-000. You are looking for Piyush Michael's answer; here is his sample code:
#include <stdio.h>
int main()
{
int ul=16000;
int rs=50000;
int s=0,carry[ul],i,j,k,ar[ul];
ar[0]=2;
for(i=1;i<ul;i++)ar[i]=0;
for(j=1;j<rs;j++)
{for(k=0;k<ul;k++)carry[k]=0;
for(i=0;i<ul;i++)
{ar[i]=ar[i]*2+carry[i];
if(ar[i]>9)
{carry[i+1]=ar[i]/10;
ar[i]=ar[i]%10;
}
}
}
for(j=ul-1;j>=0;j--)printf("%d",ar[j]);
for(i=0;i<ul-1;i++)s+=ar[i];
printf("\n\n%d",s);
}
Currently the following problem takes 3.008 seconds to execute for some test case provided on hackerearth.com, where the allowed time is 3.0 seconds, so I get a time limit error. Please help me reduce the execution time.
Problem:
Alice has just learnt to multiply two integers. He wants to multiply two integers X and Y to form a number Z. To make the problem interesting he will choose X in the range [1,M] and Y in the range [1,N]. Help him to find the number of ways in which he can do this.
Input
First line of the input is the number of test cases T. It is followed by T lines. Each line has three space separated integers, the numbers Z, M and N.
Output
For each test case output a single integer, the number of ways.
Constraints
1 <= T <= 50
1 <= Z <= 10^12
1 <= M <= 10^12
1 <= N <= 10^12
CODE:
#include <iostream>
using namespace std;
int chk_div(long long a,long long b)
{
if(((a/b) * (b) )==a)return 1;
return 0;
}
int main()
{
int t;
long i,j,count;
long n,m,z;
cin>>t;
while(t--)
{count=0;
cin>>z>>m>>n;
if(m>z)m=z;
if(n>z)n=z;
if (m>n)m=n;
for(i=1;i<=m;i++)
{
if(chk_div(z,i))count++;
}
cout<<count<<"\n";
}
return 0;
}
The main problem with performance here is the fact that your inner loop does about 10^12 iterations. You can reduce it a million times to sqrt(z) <= 10^6.
The trick here is to notice that Alice can write z = x * y if and only if he can write z = y * x. Also, either x <= sqrt(z) or y <= sqrt(z). Using these facts you can iterate only up to square root of z to count all cases.
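A compact sketch of that idea (my own illustration, with assumed names):
// Counts pairs (x, y) with x * y == z, 1 <= x <= m, 1 <= y <= n.
long long countWays(long long z, long long m, long long n) {
    long long count = 0;
    for (long long d = 1; d * d <= z; ++d) {     // only up to sqrt(z) iterations
        if (z % d != 0)
            continue;
        long long e = z / d;                     // the matching co-divisor, e >= d
        if (d <= m && e <= n) ++count;           // x = d, y = e
        if (d != e && e <= m && d <= n) ++count; // x = e, y = d (skip when d == e)
    }
    return count;
}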
I believe this should get the job done (idea from @zch's answer):
#include <iostream>
#include <cmath>
auto MAX = [] (long long A, long long B) -> long long { return A > B ? A : B; }; // long long: inputs go up to 10^12
auto MIN = [] (long long A, long long B) -> long long { return A < B ? A : B; };
using std::cout;
using std::cin;
int main() {
long long Z, M, N, T, low, high, temp, div;
int ans;
for (cin >> T; T--; ) {
cin >> Z >> M >> N;
temp = MIN(M, N);
low = MIN(sqrt(Z), temp);
high = MAX(M, N);
for( ans = 0; low > 0 && (Z / low) <= high; --low ) {
if ( Z % low == 0) {
++ans;
div = Z / low;
ans += (div != low && div <= temp);
}
//cout << temp << " * " << Z / temp << " = " << Z << "\n";
}
cout << ans << "\n";
}
return 0;
}
Will be adding comments in a bit
Code with comments:
#include <iostream>
#include <cmath>
auto MAX = [] (long long A, long long B) -> long long { return A > B ? A : B; }; // long long: inputs go up to 10^12
auto MIN = [] (long long A, long long B) -> long long { return A < B ? A : B; };
using std::cout;
using std::cin;
int main() {
long long Z, M, N, T, low, high, temp, div;
int ans;
for (cin >> T; T--; ) {
cin >> Z >> M >> N;
temp = MIN(M, N);
low = MIN(sqrt(Z), temp);//Lowest value <--We start iteration from this number
high = MAX(M, N); //Maximum value
for( ans = 0; low > 0 && (Z / low) <= high; --low ) {
//Number of things going on in this for-loop
//I will start by explaining the condition:
//We want to keep iterating until either low is below 1
// or when the expression (Z / low) > high.
//Notice that as the value of low approaches 0,
//the expression (Z / low) approaches inf
if ( Z % low == 0) {
//If this condition evaluates to true, we know 2 things:
/*Z is divisible by this value of low and
low is in the range of MIN(M,N) <--true*/
/*Because of our condition, (Z / low) is
within the range of MAX(M, N) <--true*/
++ans;
div = Z / low;
//This second part checks if the opposite is true i.e.
/*the value of low is in the range of
MAX(M, N) <--true*/
/*the value (Z / low) is in the range of
MIN(M, N) <--true only in some cases*/
ans += (div != low && div <= temp);
//(div != low) is to avoid double counting
/*An example of this is when Z, M, N have the values:
1000000, 1000000, 1000000
The value of low at the start is 1000 */
}
}
cout << ans << "\n";
}
return 0;
}
In fact, you have to solve the problem in a different way:
find the prime decomposition:
so Z = A^a * B^b * ... * P^p with A, B, ..., P prime numbers,
and then you just have to compute the number of possibilities from a, b, ..., p.
(So the result is at most (1 + a) * (1 + b) * ... * (1 + p), depending on the M and N constraints.)
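A sketch of the divisor-count part (my own illustration); applying the M and N bounds still requires enumerating or pairing the divisors as in the other answers:
// For z = p1^a1 * p2^a2 * ... * pk^ak, the number of divisors is (a1+1)*(a2+1)*...*(ak+1).
long long divisorCount(long long z) {
    long long count = 1;
    for (long long p = 2; p * p <= z; ++p) {
        int exponent = 0;
        while (z % p == 0) {
            z /= p;
            ++exponent;
        }
        count *= exponent + 1;
    }
    if (z > 1)          // one prime factor larger than sqrt(original z) may remain
        count *= 2;
    return count;
}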
Your if(((a/b) * (b) ) == a) return 1; will always return 1. Why are you dividing A with B (a/b) then multiply the result by B. This is ambiguous because, your answer will be A. when you say, (a/b) * (b). B`s will cancel each other out and you are left with A as your answer. And so basically you are comparing if A == A, which is true.
How can I implement division using bit-wise operators (not just division by powers of 2)?
Describe it in detail.
The standard way to do division is by implementing binary long-division. This involves subtraction, so as long as you don't discount this as not a bit-wise operation, then this is what you should do. (Note that you can of course implement subtraction, very tediously, using bitwise logical operations.)
In essence, if you're doing Q = N/D:
Align the most-significant ones of N and D.
Compute t = (N - D);.
If (t >= 0), then set the least significant bit of Q to 1, and set N = t.
Left-shift N by 1.
Left-shift Q by 1.
Go to step 2.
Loop for as many output bits (including fractional) as you require, then apply a final shift to undo what you did in Step 1.
Division of two numbers using bitwise operators.
#include <stdio.h>
int remainder, divisor;
int division(int tempdividend, int tempdivisor) {
int quotient = 1;
if (tempdivisor == tempdividend) {
remainder = 0;
return 1;
} else if (tempdividend < tempdivisor) {
remainder = tempdividend;
return 0;
}
do{
tempdivisor = tempdivisor << 1;
quotient = quotient << 1;
} while (tempdivisor <= tempdividend);
/* Call division recursively */
quotient = quotient + division(tempdividend - tempdivisor, divisor);
return quotient;
}
int main() {
int dividend;
printf ("\nEnter the Dividend: ");
scanf("%d", ÷nd);
printf("\nEnter the Divisor: ");
scanf("%d", &divisor);
printf("\n%d / %d: quotient = %d", dividend, divisor, division(dividend, divisor));
printf("\n%d / %d: remainder = %d", dividend, divisor, remainder);
getchar(); /* portable replacement for getch() */
}
#include <stdio.h>

int remainder = 0;
int division(int dividend, int divisor)
{
int quotient = 1;
int neg = 1;
if ((dividend>0 &&divisor<0)||(dividend<0 && divisor>0))
neg = -1;
// Convert to positive
unsigned int tempdividend = (dividend < 0) ? -dividend : dividend;
unsigned int tempdivisor = (divisor < 0) ? -divisor : divisor;
if (tempdivisor == tempdividend) {
remainder = 0;
return 1*neg;
}
else if (tempdividend < tempdivisor) {
if (dividend < 0)
remainder = tempdividend*neg;
else
remainder = tempdividend;
return 0;
}
while (tempdivisor<<1 <= tempdividend)
{
tempdivisor = tempdivisor << 1;
quotient = quotient << 1;
}
// Call division recursively
if(dividend < 0)
quotient = quotient*neg + division(-(tempdividend-tempdivisor), divisor);
else
quotient = quotient*neg + division(tempdividend-tempdivisor, divisor);
return quotient;
}
void main()
{
int dividend,divisor;
char ch = 's';
while(ch != 'x')
{
printf ("\nEnter the Dividend: ");
scanf("%d", ÷nd);
printf("\nEnter the Divisor: ");
scanf("%d", &divisor);
printf("\n%d / %d: quotient = %d", dividend, divisor, division(dividend, divisor));
printf("\n%d / %d: remainder = %d", dividend, divisor, remainder);
getchar(); /* portable replacement for _getch() */
}
}
I assume we are discussing division of integers.
Consider that I have two numbers, 1502 and 30, and I want to calculate 1502/30. This is how we do it:
First we align 30 with 1502 at its most significant figure: 30 becomes 3000. Then we compare 1502 with 3000; 1502 contains 0 of 3000. Then we compare 1502 with 300; it contains 5 of 300. Then compare (1502 - 5*300) = 2 with 30; it contains 0 of 30. So at last we get 0*(10^2) + 5*(10^1) + 0*(10^0) = 50 as the result of this division.
Now convert both 1502 and 30 into binary digits. Then instead of multiplying 30 by (10^x) to align it with 1502, we multiply 30 (in base 2) by 2^n to align, and multiplying by 2^n is just a left shift by n positions.
Here is the code:
int divide(int a, int b){
if (b == 0)
return 0; // guard against division by zero
//To check if a or b are negative.
bool neg = false;
if ((a>0 && b<0)||(a<0 && b>0))
neg = true;
//Convert to positive
unsigned int new_a = (a < 0) ? -a : a;
unsigned int new_b = (b < 0) ? -b : b;
//Check the largest n such that b >= 2^n, and assign the n to n_pwr
int n_pwr = 0;
for (int i = 0; i < 32; i++)
{
if (((1 << i) & new_b) != 0)
n_pwr = i;
}
//So that 'a' could only contain 2^(31-n_pwr) many b's,
//start from here to try the result
unsigned int res = 0;
for (int i = 31 - n_pwr; i >= 0; i--){
if ((new_b << i) <= new_a){
res += (1 << i);
new_a -= (new_b << i);
}
}
return neg ? -res : res;
}
Didn't test it, but you get the idea.
This solution works perfectly.
#include <stdio.h>
int division(int dividend, int divisor, int origdiv, int * remainder)
{
int quotient = 1;
if (dividend == divisor)
{
*remainder = 0;
return 1;
}
else if (dividend < divisor)
{
*remainder = dividend;
return 0;
}
while (divisor <= dividend)
{
divisor = divisor << 1;
quotient = quotient << 1;
}
if (dividend < divisor)
{
divisor >>= 1;
quotient >>= 1;
}
quotient = quotient + division(dividend - divisor, origdiv, origdiv, remainder);
return quotient;
}
int main()
{
int n = 377;
int d = 7;
int rem = 0;
printf("Quotient : %d\n", division(n, d, d, &rem));
printf("Remainder: %d\n", rem);
return 0;
}
Implement division without the division operator:
You will need to include subtraction. But then it is just like you would do it by hand (only in base 2). The appended code provides a short function that does exactly this.
uint32_t udiv32(uint32_t n, uint32_t d) {
// n is dividend, d is divisor
// store the result in q: q = n / d
uint32_t q = 0;
// as long as the divisor fits into the remainder there is something to do
while (n >= d) {
uint32_t i = 0, d_t = d;
// determine to which power of two the divisor still fits the dividend
//
// i.e.: we intend to subtract the divisor multiplied by powers of two
// which in turn gives us a one in the binary representation
// of the result
while (n >= (d_t << 1) && ++i)
d_t <<= 1;
// set the corresponding bit in the result
q |= 1 << i;
// subtract the multiple of the divisor to be left with the remainder
n -= d_t;
// repeat until the divisor does not fit into the remainder anymore
}
return q;
}
The method below is an implementation of binary division, assuming both numbers are positive. If subtraction is a concern, we can implement that as well using bitwise operators.
Code
-(int)binaryDivide:(int)numerator with:(int)denominator
{
if (numerator == 0 || denominator == 1) {
return numerator;
}
if (denominator == 0) {
#ifdef DEBUG
NSAssert(denominator != 0, @"denominator should be greater than 0");
#endif
return INFINITY;
}
// if (numerator <0) {
// numerator = abs(numerator);
// }
int maxBitDenom = [self getMaxBit:denominator];
int maxBitNumerator = [self getMaxBit:numerator];
int msbNumber = [self getMSB:maxBitDenom ofNumber:numerator];
int qoutient = 0;
int subResult = 0;
int remainingBits = maxBitNumerator-maxBitDenom;
if (msbNumber >= denominator) {
qoutient |=1;
subResult = msbNumber - denominator;
}
else {
subResult = msbNumber;
}
while (remainingBits>0) {
int msbBit = (numerator & (1 << (remainingBits-1)))>0 ? 1 : 0;
subResult = (subResult << 1) |msbBit;
if (subResult >= denominator) {
subResult = subResult-denominator;
qoutient = (qoutient << 1) | 1;
}
else {
qoutient = qoutient << 1;
}
remainingBits--;
}
return qoutient;
}
-(int)getMaxBit:(int)inputNumber
{
int maxBit =0;
BOOL isMaxBitSet = NO;
for (int i=0; i<sizeof(inputNumber)*8; i++) {
if (inputNumber & (1 << i) ) {
maxBit = i;
isMaxBitSet=YES;
}
}
if (isMaxBitSet) {
maxBit += 1;
}
return maxBit;
}
-(int)getMSB:(int)bits ofNumber:(int)number
{
int numbeMaxBit = [self getMaxBit:number];
return number >> (numbeMaxBit -bits);
}
For integers:
public class Division {
public static void main(String[] args) {
System.out.println("Division: " + divide(100, 9));
}
public static int divide(int num, int divisor) {
int sign = 1;
if((num > 0 && divisor < 0) || (num < 0 && divisor > 0))
sign = -1;
return divide(Math.abs(num), Math.abs(divisor), Math.abs(divisor)) * sign;
}
public static int divide(int num, int divisor, int sum) {
if (sum > num) {
return 0;
}
return 1 + divide(num, divisor, sum + divisor);
}
}
With the usual caveats about C's behaviour with shifts, this ought to work for unsigned quantities regardless of the native size of an int...
static unsigned int udiv(unsigned int a, unsigned int b) {
unsigned int c = 1, result = 0;
if (b == 0) return (unsigned int)-1 /*infinity*/;
while (((int)b > 0) && (b < a)) { b = b<<1; c = c<<1; }
do {
if (a >= b) { a -= b; result += c; }
b = b>>1; c = c>>1;
} while (c);
return result;
}
This is my solution to implement division with only bitwise operations:
int align(int a, int b) {
while (b < a) b <<= 1;
return b;
}
int divide(int a, int b) {
int temp = b;
int result = 0;
b = align(a, b);
do {
result <<= 1;
if (a >= b) {
// sub(a,b) is a self-defined bitwise function for a minus b
a = sub(a,b);
result = result | 1;
}
b >>= 1;
} while (b >= temp);
return result;
}
Unsigned Long Division (JavaScript) - based on Wikipedia article: https://en.wikipedia.org/wiki/Division_algorithm:
"Long division is the standard algorithm used for pen-and-paper division of multi-digit numbers expressed in decimal notation. It shifts gradually from the left to the right end of the dividend, subtracting the largest possible multiple of the divisor (at the digit level) at each stage; the multiples then become the digits of the quotient, and the final difference is then the remainder.
When used with a binary radix, this method forms the basis for the (unsigned) integer division with remainder algorithm below."
Function divideWithoutDivision at the end wraps it to allow negative operands. I used it to solve leetcode problem "Product of Array Except Self"
function longDivision(N, D) {
let Q = 0; //quotient and remainder
let R = 0;
let n = mostSignificantBitIn(N);
for (let i = n; i >= 0; i--) {
R = R << 1;
R = setBit(R, 0, getBit(N, i));
if (R >= D) {
R = R - D;
Q = setBit(Q, i, 1);
}
}
//return [Q, R];
return Q;
}
function mostSignificantBitIn(N) {
for (let i = 31; i >= 0; i--) {
if (N & (1 << i))
return i ;
}
return 0;
}
function getBit(N, i) {
return (N & (1 << i)) >> i;
}
function setBit(N, i, value) {
return N | (value << i);
}
function divideWithoutDivision(dividend, divisor) {
let negativeResult = (dividend < 0) ^ (divisor < 0);
dividend = Math.abs(dividend);
divisor = Math.abs(divisor);
let quotient = longDivision(dividend, divisor);
return negativeResult ? -quotient : quotient;
}
All these solutions are too long. The base idea is to write the quotient (for example, 5=101) as 100 + 00 + 1 = 101.
public static Point divide(int a, int b) {
if (a < b)
return new Point(0,a);
if (a == b)
return new Point(1,0);
int q = b;
int c = 1;
while (q<<1 < a) {
q <<= 1;
c <<= 1;
}
Point r = divide(a-q, b);
return new Point(c + r.x, r.y);
}
public static class Point {
int x;
int y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
public int compare(Point b) {
if (b.x - x != 0) {
return x - b.x;
} else {
return y - b.y;
}
}
@Override
public String toString() {
return " (" + x + " " + y + ") ";
}
}
Since bit wise operations work on bits that are either 0 or 1, each bit represents a power of 2, so if I have the bits
1010
that value is 10.
Each bit is a power of two, so if we shift the bits to the right, we divide by 2
1010 --> 0101
0101 is 5
So, in general, if you want to divide by some power of 2, you shift right by the exponent you raise two to in order to get that value: for instance, to divide by 16 you shift right by 4, since 2^4 = 16.
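A tiny check of that statement (for non-negative values):
#include <cassert>

int main() {
    unsigned int x = 250;
    assert((x >> 4) == x / 16);  // shifting right by 4 divides by 2^4 = 16 (both give 15)
    return 0;
}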
Answering another question, I wrote the program below to compare different search methods in a sorted array. Basically I compared two implementations of interpolation search and one of binary search. I compared performance by counting the cycles spent (on the same set of data) by the different variants.
However I'm sure there are ways to optimize these functions to make them even faster. Does anyone have any ideas on how I can make this search function faster? A solution in C or C++ is acceptable, but I need it to process an array with 100000 elements.
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <stdint.h>
#include <assert.h>
static __inline__ unsigned long long rdtsc(void)
{
unsigned long long int x;
__asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
return x;
}
int interpolationSearch(int sortedArray[], int toFind, int len) {
// Returns index of toFind in sortedArray, or -1 if not found
int64_t low = 0;
int64_t high = len - 1;
int64_t mid;
int l = sortedArray[low];
int h = sortedArray[high];
while (l <= toFind && h >= toFind) {
mid = low + (int64_t)((int64_t)(high - low)*(int64_t)(toFind - l))/((int64_t)(h-l));
int m = sortedArray[mid];
if (m < toFind) {
l = sortedArray[low = mid + 1];
} else if (m > toFind) {
h = sortedArray[high = mid - 1];
} else {
return mid;
}
}
if (sortedArray[low] == toFind)
return low;
else
return -1; // Not found
}
int interpolationSearch2(int sortedArray[], int toFind, int len) {
// Returns index of toFind in sortedArray, or -1 if not found
int low = 0;
int high = len - 1;
int mid;
int l = sortedArray[low];
int h = sortedArray[high];
while (l <= toFind && h >= toFind) {
mid = low + ((float)(high - low)*(float)(toFind - l))/(1+(float)(h-l));
int m = sortedArray[mid];
if (m < toFind) {
l = sortedArray[low = mid + 1];
} else if (m > toFind) {
h = sortedArray[high = mid - 1];
} else {
return mid;
}
}
if (sortedArray[low] == toFind)
return low;
else
return -1; // Not found
}
int binarySearch(int sortedArray[], int toFind, int len)
{
// Returns index of toFind in sortedArray, or -1 if not found
int low = 0;
int high = len - 1;
int mid;
int l = sortedArray[low];
int h = sortedArray[high];
while (l <= toFind && h >= toFind) {
mid = (low + high)/2;
int m = sortedArray[mid];
if (m < toFind) {
l = sortedArray[low = mid + 1];
} else if (m > toFind) {
h = sortedArray[high = mid - 1];
} else {
return mid;
}
}
if (sortedArray[low] == toFind)
return low;
else
return -1; // Not found
}
int order(const void *p1, const void *p2) { return *(int*)p1-*(int*)p2; }
int main(void) {
int i = 0, j = 0, size = 100000, trials = 10000;
int searched[trials];
srand(-time(0));
for (j=0; j<trials; j++) { searched[j] = rand()%size; }
while (size > 10){
int arr[size];
for (i=0; i<size; i++) { arr[i] = rand()%size; }
qsort(arr,size,sizeof(int),order);
unsigned long long totalcycles_bs = 0;
unsigned long long totalcycles_is_64 = 0;
unsigned long long totalcycles_is_float = 0;
unsigned long long totalcycles_new = 0;
int res_bs, res_is_64, res_is_float, res_new;
for (j=0; j<trials; j++) {
unsigned long long tmp, cycles = rdtsc();
res_bs = binarySearch(arr,searched[j],size);
tmp = rdtsc(); totalcycles_bs += tmp - cycles; cycles = tmp;
res_is_64 = interpolationSearch(arr,searched[j],size);
assert(res_is_64 == res_bs || arr[res_is_64] == searched[j]);
tmp = rdtsc(); totalcycles_is_64 += tmp - cycles; cycles = tmp;
res_is_float = interpolationSearch2(arr,searched[j],size);
assert(res_is_float == res_bs || arr[res_is_float] == searched[j]);
tmp = rdtsc(); totalcycles_is_float += tmp - cycles; cycles = tmp;
}
printf("----------------- size = %10d\n", size);
printf("binary search = %10llu\n", totalcycles_bs);
printf("interpolation uint64_t = %10llu\n", totalcycles_is_64);
printf("interpolation float = %10llu\n", totalcycles_is_float);
printf("new = %10llu\n", totalcycles_new);
printf("\n");
size >>= 1;
}
}
If you have some control over the in-memory layout of the data, you might want to look at Judy arrays.
Or to put a simpler idea out there: a binary search always cuts the search space in half. An optimal cut point can be found with interpolation (the cut point should NOT be the place where the key is expected to be, but the point which minimizes the statistical expectation of the search space for the next step). This minimizes the number of steps but... not all steps have equal cost. Hierarchical memories allow executing a number of tests in the same time as a single test, if locality can be maintained. Since a binary search's first M steps only touch a maximum of 2**M unique elements, storing these together can yield a much better reduction of search space per-cacheline fetch (not per comparison), which is higher performance in the real world.
n-ary trees work on that basis, and then Judy arrays add a few less important optimizations.
Bottom line: even "Random Access Memory" (RAM) is faster when accessed sequentially than randomly. A search algorithm should use that fact to its advantage.
Benchmarked on Win32 Core2 Quad Q6600, gcc v4.3 msys. Compiling with g++ -O3, nothing fancy.
Observation: the asserts, timing and loop overhead are about 40%, so any gains listed below should be divided by 0.6 to get the actual improvement in the algorithms under test.
Simple answers:
On my machine replacing the int64_t with int for "low", "high" and "mid" in interpolationSearch gives a 20% to 40% speed up. This is the fastest easy method I could find. It is taking about 150 cycles per look-up on my machine (for the array size of 100000). That's roughly the same number of cycles as a cache miss. So in real applications, looking after your cache is probably going to be the biggest factor.
Replacing binarySearch's "/2" with a ">>1" gives a 4% speed up.
Using the STL's binary_search algorithm, on a vector containing the same data as "arr", is about the same speed as the hand-coded binarySearch. Although at the smaller sizes the STL version is much slower, around 40%.
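For reference, a minimal sketch of the STL variant mentioned above (std::binary_search only returns a bool, so std::lower_bound is used here to recover the index):
#include <algorithm>
#include <vector>

int stlSearch(const std::vector<int>& v, int toFind) {
    auto it = std::lower_bound(v.begin(), v.end(), toFind);
    return (it != v.end() && *it == toFind) ? int(it - v.begin()) : -1;
}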
I have an excessively complicated solution, which requires a specialized sorting function. The sort is slightly slower than a good quicksort, but all of my tests show that the search function is much faster than a binary or interpolation search. I called it a regression sort before I found out that the name was already taken, but didn't bother to think of a new name (ideas?).
There are three files to demonstrate.
The regression sort/search code:
#include <sstream>
#include <math.h>
#include <ctime>
#include "limits.h"
void insertionSort(int array[], int length) {
int key, j;
for(int i = 1; i < length; i++) {
key = array[i];
j = i - 1;
while (j >= 0 && array[j] > key) {
array[j + 1] = array[j];
--j;
}
array[j + 1] = key;
}
}
class RegressionTable {
public:
RegressionTable(int arr[], int s, int lower, int upper, double mult, int divs);
RegressionTable(int arr[], int s);
void sort(void);
int find(int key);
void printTable(void);
void showSize(void);
private:
void createTable(void);
inline unsigned int resolve(int n);
int * array;
int * table;
int * tableSize;
int size;
int lowerBound;
int upperBound;
int divisions;
int divisionSize;
int newSize;
double multiplier;
};
RegressionTable::RegressionTable(int arr[], int s) {
array = arr;
size = s;
multiplier = 1.35;
divisions = sqrt(size);
upperBound = INT_MIN;
lowerBound = INT_MAX;
for (int i = 0; i < size; ++i) {
if (array[i] > upperBound)
upperBound = array[i];
if (array[i] < lowerBound)
lowerBound = array[i];
}
createTable();
}
RegressionTable::RegressionTable(int arr[], int s, int lower, int upper, double mult, int divs) {
array = arr;
size = s;
lowerBound = lower;
upperBound = upper;
multiplier = mult;
divisions = divs;
createTable();
}
void RegressionTable::showSize(void) {
int bytes = sizeof(*this);
bytes = bytes + sizeof(int) * 2 * (divisions + 1);
}
void RegressionTable::createTable(void) {
divisionSize = size / divisions;
newSize = multiplier * double(size);
table = new int[divisions + 1];
tableSize = new int[divisions + 1];
for (int i = 0; i < divisions; ++i) {
table[i] = 0;
tableSize[i] = 0;
}
for (int i = 0; i < size; ++i) {
++table[((array[i] - lowerBound) / divisionSize) + 1];
}
for (int i = 1; i <= divisions; ++i) {
table[i] += table[i - 1];
}
table[0] = 0;
for (int i = 0; i < divisions; ++i) {
tableSize[i] = table[i + 1] - table[i];
}
}
int RegressionTable::find(int key) {
double temp = multiplier;
multiplier = 1;
int minIndex = table[(key - lowerBound) / divisionSize];
int maxIndex = minIndex + tableSize[key / divisionSize];
int guess = resolve(key);
double t;
while (array[guess] != key) {
// uncomment this line if you want to see where it is searching.
//cout << "Regression Guessing " << guess << ", not there." << endl;
if (array[guess] < key) {
minIndex = guess + 1;
}
if (array[guess] > key) {
maxIndex = guess - 1;
}
if (array[minIndex] > key || array[maxIndex] < key) {
return -1;
}
t = ((double)key - array[minIndex]) / ((double)array[maxIndex] - array[minIndex]);
guess = minIndex + t * (maxIndex - minIndex);
}
multiplier = temp;
return guess;
}
inline unsigned int RegressionTable::resolve(int n) {
float temp;
int subDomain = (n - lowerBound) / divisionSize;
temp = n % divisionSize;
temp /= divisionSize;
temp *= tableSize[subDomain];
temp += table[subDomain];
temp *= multiplier;
return (unsigned int)temp;
}
void RegressionTable::sort(void) {
int * out = new int[int(size * multiplier)];
bool * used = new bool[int(size * multiplier)](); // value-initialize to false: the flags are read before being set below
int higher, lower;
bool placed;
for (int i = 0; i < size; ++i) {
/* Figure out where to put the darn thing */
higher = resolve(array[i]);
lower = higher - 1;
if (higher > newSize) {
higher = size;
lower = size - 1;
} else if (lower < 0) {
higher = 0;
lower = 0;
}
placed = false;
while (!placed) {
if (higher < size && !used[higher]) {
out[higher] = array[i];
used[higher] = true;
placed = true;
} else if (lower >= 0 && !used[lower]) {
out[lower] = array[i];
used[lower] = true;
placed = true;
}
--lower;
++higher;
}
}
int index = 0;
for (int i = 0; i < size * multiplier; ++i) {
if (used[i]) {
array[index] = out[i];
++index;
}
}
delete[] out; // release the scratch buffers
delete[] used;
insertionSort(array, size);
}
And then there are the regular search functions:
#include <iostream>
using namespace std;
int binarySearch(int array[], int start, int end, int key) {
// Determine the search point.
int searchPos = (start + end) / 2;
// If we crossed over our bounds or met in the middle, then it is not here.
if (start >= end)
return -1;
// Search the bottom half of the array if the query is smaller.
if (array[searchPos] > key)
return binarySearch (array, start, searchPos - 1, key);
// Search the top half of the array if the query is larger.
if (array[searchPos] < key)
return binarySearch (array, searchPos + 1, end, key);
// Otherwise we found it, so we are done.
return searchPos;
}
int binarySearch(int array[], int size, int key) {
return binarySearch(array, 0, size - 1, key);
}
int interpolationSearch(int array[], int size, int key) {
int guess = 0;
double t;
int minIndex = 0;
int maxIndex = size - 1;
while (array[guess] != key) {
t = ((double)key - array[minIndex]) / ((double)array[maxIndex] - array[minIndex]);
guess = minIndex + t * (maxIndex - minIndex);
if (array[guess] < key) {
minIndex = guess + 1;
}
if (array[guess] > key) {
maxIndex = guess - 1;
}
if (array[minIndex] > key || array[maxIndex] < key) {
return -1;
}
}
return guess;
}
And then I wrote a simple main to test out the different sorts.
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <ctime>
#include "regression.h"
#include "search.h"
using namespace std;
void randomizeArray(int array[], int size) {
for (int i = 0; i < size; ++i) {
array[i] = rand() % size;
}
}
int main(int argc, char * argv[]) {
int size = 100000;
string arg;
if (argc > 1) {
arg = argv[1];
size = atoi(arg.c_str());
}
srand(time(NULL));
int * array;
cout << "Creating Array Of Size " << size << "...\n";
array = new int[size];
randomizeArray(array, size);
cout << "Sorting Array...\n";
RegressionTable t(array, size, 0, size*2.5, 1.5, size);
//RegressionTable t(array, size);
t.sort();
int trials = 10000000;
int start;
cout << "Binary Search...\n";
start = clock();
for (int i = 0; i < trials; ++i) {
binarySearch(array, size, i % size);
}
cout << clock() - start << endl;
cout << "Interpolation Search...\n";
start = clock();
for (int i = 0; i < trials; ++i) {
interpolationSearch(array, size, i % size);
}
cout << clock() - start << endl;
cout << "Regression Search...\n";
start = clock();
for (int i = 0; i < trials; ++i) {
t.find(i % size);
}
cout << clock() - start << endl;
return 0;
}
Give it a try and tell me if it's faster for you. It's super complicated, so it's really easy to break it if you don't know what you are doing. Be careful about modifying it.
I compiled the main with g++ on ubuntu.
Unless your data is known to have special properties, pure interpolation search has the risk of taking linear time. If you expect interpolation to help with most data but don't want it to hurt in the case of pathological data, I would use a (possibly weighted) average of the interpolated guess and the midpoint, ensuring a logarithmic bound on the run time.
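A sketch of that guarded cut point (my own illustration, here with equal weights); because the chosen index always stays a constant fraction of the range away from both ends, the range still shrinks geometrically even on pathological data:
int guardedMid(const int sortedArray[], int low, int high, int toFind) {
    // assumes sortedArray[low] <= toFind <= sortedArray[high] and low < high
    int mid = low + (high - low) / 2;                                  // plain binary-search cut
    long long span = (long long)sortedArray[high] - sortedArray[low];
    long long off  = (long long)toFind - sortedArray[low];
    int interp = span ? (int)(low + (high - low) * off / span) : mid;  // interpolated guess
    return mid + (interp - mid) / 2;                                   // equal-weight blend of the two
}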
One way of approaching this is to use a space versus time trade-off. There are any number of ways that could be done. The extreme way would be to simply make an array with the max size being the max value of the sorted array. Initialize each position with the index into sortedArray. Then the search would simply be O(1).
The following version, however, might be a little more realistic and possibly be useful in the real world. It uses a "helper" structure that is initialized on the first call. It maps the search space down to a smaller space by dividing by a number that I pulled out of the air without much testing. It stores the index of the lower bound for a group of values in sortedArray into the helper map. The actual search divides the toFind number by the chosen divisor and extracts the narrowed bounds of sortedArray for a normal binary search.
For example, if the sorted values range from 1 to 1000 and the divisor is 100, then the lookup array might contain 10 "sections". To search for value 250, it would divide it by 100 to yield integer index position 250/100=2. map[2] would contain the sortedArray index for values 200 and larger. map[3] would have the index position of values 300 and larger thus providing a smaller bounding position for a normal binary search. The rest of the function is then an exact copy of your binary search function.
The initialization of the helper map might be more efficient by using a binary search to fill in the positions rather than a simple scan, but it is a one time cost so I didn't bother testing that. This mechanism works well for the given test numbers which are evenly distributed. As written, it would not be as good if the distribution was not even. I think this method could be used with floating point search values too. However, extrapolating it to generic search keys might be harder. For example, I am unsure what the method would be for character data keys. It would need some kind of O(1) lookup/hash that mapped to a specific array position to find the index bounds. It's unclear to me at the moment what that function would be or if it exists.
I kludged the setup of the helper map in the following implementation pretty quickly. It is not pretty and I'm not 100% sure it is correct in all cases but it does show the idea. I ran it with a debug test to compare the results against your existing binarySearch function to be somewhat sure it works correctly.
The following are example numbers:
100000 * 10000 : cycles binary search = 10197811
100000 * 10000 : cycles interpolation uint64_t = 9007939
100000 * 10000 : cycles interpolation float = 8386879
100000 * 10000 : cycles binary w/helper = 6462534
Here is the quick-and-dirty implementation:
#define REDUCTION 100 // pulled out of the air
typedef struct {
int init; // have we initialized it?
int numSections;
int *map;
int divisor;
} binhelp;
int binarySearchHelp( binhelp *phelp, int sortedArray[], int toFind, int len)
{
// Returns index of toFind in sortedArray, or -1 if not found
int low;
int high;
int mid;
if ( !phelp->init && len > REDUCTION ) {
int i;
int numSections = len / REDUCTION;
int divisor = (( sortedArray[len-1] - 1 ) / numSections ) + 1;
int threshold;
int arrayPos;
phelp->init = 1;
phelp->divisor = divisor;
phelp->numSections = numSections;
phelp->map = (int*)malloc((numSections+2) * sizeof(int));
phelp->map[0] = 0;
phelp->map[numSections+1] = len-1;
arrayPos = 0;
// Scan through the array and set up the mapping positions. Simple linear
// scan but it is a one-time cost.
for ( i = 1; i <= numSections; i++ ) {
threshold = i * divisor;
while ( arrayPos < len && sortedArray[arrayPos] < threshold )
arrayPos++;
if ( arrayPos < len )
phelp->map[i] = arrayPos;
else
// kludge to take care of aliasing
phelp->map[i] = len - 1;
}
}
if ( phelp->init ) {
int section = toFind / phelp->divisor;
if ( section > phelp->numSections )
// it is bigger than all values
return -1;
low = phelp->map[section];
if ( section == phelp->numSections )
high = len - 1;
else
high = phelp->map[section+1];
} else {
// use normal start points
low = 0;
high = len - 1;
}
// the following is a direct copy of the Kriss' binarySearch
int l = sortedArray[low];
int h = sortedArray[high];
while (l <= toFind && h >= toFind) {
mid = (low + high)/2;
int m = sortedArray[mid];
if (m < toFind) {
l = sortedArray[low = mid + 1];
} else if (m > toFind) {
h = sortedArray[high = mid - 1];
} else {
return mid;
}
}
if (sortedArray[low] == toFind)
return low;
else
return -1; // Not found
}
The helper structure needs to be initialized (and memory freed):
help.init = 0;
unsigned long long totalcycles4 = 0;
... make the calls same as for the other ones but pass the structure ...
binarySearchHelp(&help, arr,searched[j],length);
if ( help.init )
free( help.map );
help.init = 0;
Look first at the data and at whether a big gain can be had from a data-specific method over a general method.
For large static sorted datasets, you can create an additional index to provide partial pigeonholing, based on the amount of memory you're willing to use. E.g. say we create a 256x256 two-dimensional array of ranges, which we populate with the start and end positions in the search array of elements with the corresponding high-order bytes. When we come to search, we then use the high-order bytes of the key to find the range / subset of the array we need to search. If we had ~20 comparisons on our binary search of 100,000 elements, O(log2(n)), we're now down to ~4 comparisons for 16 elements, or O(log2(n/15)). The memory cost here is about 512k.
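A rough sketch of such an index (my own illustration, assuming non-negative int keys and bucketing on the high 16 bits); each lookup then only searches one bucket's slice:
#include <algorithm>
#include <cstdint>
#include <vector>

struct HighBitsIndex {
    std::vector<int> first;                       // first[b] = first position whose (key >> 16) >= b
};

HighBitsIndex buildIndex(const std::vector<int32_t>& sorted) {
    HighBitsIndex idx{std::vector<int>(65537)};
    for (int b = 0; b <= 65536; ++b)              // one-time cost, ~256 KB of index
        idx.first[b] = (int)(std::lower_bound(sorted.begin(), sorted.end(),
                                              (int64_t)b << 16) - sorted.begin());
    return idx;
}

int indexedFind(const std::vector<int32_t>& sorted, const HighBitsIndex& idx, int32_t key) {
    int b = key >> 16;                            // bucket of the key's high bits
    auto lo = sorted.begin() + idx.first[b];
    auto hi = sorted.begin() + idx.first[b + 1];  // the slice is small and cache-friendly
    auto it = std::lower_bound(lo, hi, key);
    return (it != hi && *it == key) ? (int)(it - sorted.begin()) : -1;
}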
Another method, again suited to data that doesn't change much, is to divide the data into arrays of commonly sought items and rarely sought items. For example, if you leave your existing search in place running a wide number of real world cases over a protracted testing period, and log the details of the item being sought, you may well find that the distribution is very uneven, i.e. some values are sought far more regularly than others. If this is the case, break your array into a much smaller array of commonly sought values and a larger remaining array, and search the smaller array first. If the data is right (big if!), you can often achieve broadly similar improvements to the first solution without the memory cost.
There are many other data specific optimizations which score far better than trying to improve on tried, tested and far more widely used general solutions.
Posting my current version before the question is closed (hopefully I will thus be able to enhance it later). For now it is worse than every other version (if someone understands why my changes to the end of the loop have this effect, comments are welcome).
int newSearch(int sortedArray[], int toFind, int len)
{
// Returns index of toFind in sortedArray, or -1 if not found
int low = 0;
int high = len - 1;
int mid;
int l = sortedArray[low];
int h = sortedArray[high];
while (l < toFind && h > toFind) {
mid = low + ((float)(high - low)*(float)(toFind - l))/(1+(float)(h-l));
int m = sortedArray[mid];
if (m < toFind) {
l = sortedArray[low = mid + 1];
} else if (m > toFind) {
h = sortedArray[high = mid - 1];
} else {
return mid;
}
}
if (l == toFind)
return low;
else if (h == toFind)
return high;
else
return -1; // Not found
}
The implementation of the binary search that was used for comparisons can be improved. The key idea is to "normalize" the range initially so that the target is always greater than the minimum and less than the maximum after the first step. This increases the termination delta size. It also has the effect of special-casing targets that are less than the first element of the sorted array or greater than the last element of the sorted array. Expect approximately a 15% improvement in search time. Here is what the code might look like in C++.
int binarySearch(int * &array, int target, int min, int max)
{ // binarySearch
// normalize min and max so that we know the target is > min and < max
if (target <= array[min]) // if min not normalized
{ // target <= array[min]
if (target == array[min]) return min;
return -1;
} // end target <= array[min]
// min is now normalized
if (target >= array[max]) // if max not normalized
{ // target >= array[max]
if (target == array[max]) return max;
return -1;
} // end target >= array[max]
// max is now normalized
while (min + 1 < max)
{ // delta >=2
int tempi = min + ((max - min) >> 1); // point to index approximately in the middle between min and max
int atempi = array[tempi]; // just in case the compiler does not optimize this
if (atempi > target)max = tempi; // if the target is smaller, we can decrease max and it is still normalized
else if (atempi < target)min = tempi; // the target is bigger, so we can increase min and it is still normalized
else return tempi; // if we found the target, return with the index
// Note that it is important that this test for equality is last because it rarely occurs.
} // end delta >=2
return -1; // nothing in between normalized min and max
} // end binarySearch