Binary Search Index Questions [duplicate]

This question already has answers here:
Binary Search algorithm implementations
(2 answers)
Closed 2 years ago.
When performing a binary search on answer, I've seen different forms of the following:
loop condition : (low+1<hi), (low<hi), (low<=hi) updating indices: (hi=mid+1), (hi=mid), (low=mid), (low=mid-1)
What is the difference between these and do they actually matter?

Each loop condition simply states when the loop ends. If you want to narrow the range down to exactly one element, lo < hi is usually the easiest choice. To stop with two elements left, lo + 1 < hi can be used. lo <= hi is usually paired with an early return statement inside the while loop.
Before updating the indices, a mid is chosen usually either (lo + hi) / 2 or (lo + hi + 1) / 2 (ignoring integer overflow). The difference between these is that the first has a bias towards lo if there are an even number of elements between lo and hi, whereas the second has a bias towards hi.
The index updates have + 1 (or - 1) attached to them to ensure that there is no infinite loop. In general, you want to make sure the range between lo and hi shrinks by at least 1 on every iteration of the loop.
For reference, here is my preferred way of doing binary search:
int binary_search(const std::vector<int>& nums, int target) {
    if (nums.empty())
        return -1;
    int l = 0;
    int h = nums.size() - 1;
    while (l < h) {
        // If the language doesn't have big ints, make sure there is no overflow.
        // This has a left bias if there are an even number of elements left.
        int m = l + (h - l) / 2;
        if (nums[m] < target) {
            // The `+ 1` here is important. Without this, if there are two elements
            // and nums[0] < target, we'll get an infinite loop.
            l = m + 1;
        } else {
            // Since `m < h`, we "make progress" in this case.
            h = m;
        }
    }
    return nums[l] == target ? l : -1;
}
I like this method, because it is clear that there is no infinite loop, and the exit condition does not rely on early return statements.
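To illustrate the midpoint bias mentioned above, here is a minimal sketch of the mirror-image variant: it pairs `l = m` with a midpoint that rounds up. The function name `last_not_greater` and its contract (index of the last element <= target) are assumptions made for this example, not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: index of the last element <= target, or -1 if none.
// Assumes nums is sorted in ascending order.
int last_not_greater(const std::vector<int>& nums, int target) {
    assert(std::is_sorted(nums.begin(), nums.end()));
    if (nums.empty() || nums.front() > target)
        return -1;
    int l = 0;
    int h = static_cast<int>(nums.size()) - 1;
    while (l < h) {
        // Round up (bias towards h). With `l = m` below, rounding down
        // would leave l unchanged when h == l + 1 and loop forever.
        int m = l + (h - l + 1) / 2;
        if (nums[m] <= target) {
            l = m;      // m > l because of the upward rounding
        } else {
            h = m - 1;  // strictly shrinks the range
        }
    }
    return l;
}
```

The rule of thumb this sketch follows: if an update keeps the midpoint (`l = m` or `h = m`), the midpoint should be biased away from that bound.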

Related

Why M = L + ((R - L) / 2) instead of M=(L+R)/2 avoid overflow in C++?

Hello I was looking at the C++ solution to the question "Suppose a sorted array is rotated at some pivot unknown to you beforehand. (i.e., 0 1 2 4 5 6 7 might become 4 5 6 7 0 1 2). How do you find an element in the rotated array efficiently? You may assume no duplicate exists in the array."
int rotated_binary_search(int A[], int N, int key) {
int L = 0;
int R = N - 1;
while (L <= R) {
// Avoid overflow, same as M=(L+R)/2
int M = L + ((R - L) / 2);
if (A[M] == key) return M;
// the bottom half is sorted
if (A[L] <= A[M]) {
if (A[L] <= key && key < A[M])
R = M - 1;
else
L = M + 1;
}
// the upper half is sorted
else {
if (A[M] < key && key <= A[R])
L = M + 1;
else
R = M - 1;
}
}
return -1;
}
and saw that the comment says using M = L + ((R - L) / 2) instead of M=(L+R)/2 avoids overflow. Why is that? Thanks in advance.

Because it does...
Let's assume for a minute you're using unsigned chars (same applies to larger integers of course).
If L is 100 and R is 200, the first version is:
M = (100 + 200) / 2 = 44 / 2 = 22
100 + 200 overflows (because the largest unsigned char is 255), so the sum wraps around to 300 - 256 = 44 (unsigned addition).
The second, on the other hand:
M = 100 + (200-100) / 2 = 100 + 100 / 2 = 150
No overflow.
As @user2357112 pointed out in a comment, there are no free lunches. If L is negative, the second version might not work while the first will.
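The arithmetic above can be reproduced with a small sketch. One caveat: in real C++, operands narrower than int are promoted to int before the addition, so the explicit cast below is needed to force the sum back into 8 bits and mimic arithmetic done entirely in the narrow type. The names `mid_naive` and `mid_safe` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// (L + R) / 2 with the sum truncated to 8 bits, as if computed in unsigned char.
std::uint8_t mid_naive(std::uint8_t l, std::uint8_t r) {
    std::uint8_t sum = static_cast<std::uint8_t>(l + r);  // 300 wraps to 44
    return static_cast<std::uint8_t>(sum / 2);
}

// L + (R - L) / 2: no intermediate value exceeds r when 0 <= l <= r.
std::uint8_t mid_safe(std::uint8_t l, std::uint8_t r) {
    assert(l <= r);  // the precondition that makes this form safe
    return static_cast<std::uint8_t>(l + (r - l) / 2);
}
```

With L = 100 and R = 200, `mid_naive` yields 22 and `mid_safe` yields 150, matching the numbers worked out above.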

Not sure, but suppose the maximum value of our integer type is 100.
R = 80 and L = 40
then,
M = (L + R) / 2
M = (120) / 2; here 120 is outside the limits of our integer type, so this causes an overflow.
However,
M = L + ((R - L) / 2)
M = 40 + ((40) / 2)
M = 40 + 20
M = 60.
So in this case we never encounter a value that exceeds the limits of our integer type. This approach will therefore never overflow, theoretically.
I hope this analogy helps.

It avoids overflow in this specific implementation, which operates under the guarantees that L and R are non-negative and L <= R. Under these guarantees it should be obvious that R - L does not overflow and L + ((R - L) / 2) does not overflow either.
In general case (i.e. for arbitrary values of L and R) R - L is as prone to overflow as L + R, meaning that this trick does not achieve anything.

The comment is wrong, for a number of reasons.
For the particular problem the risk of overflow is probably nil.
Reordering calculations does not guarantee that the compiler will perform them in that order.
If there is a range of values for which an ordering can cause overflow, then there is another range of values for which the reordered calculation will cause overflow.
If overflow could be a problem then it should be controlled explicitly, not implicitly.
This is an excellent place for an assert. In this case the algorithm is only valid if N is less than half the maximum positive range of int, so say it in an assert.
If the algorithm is required to work for the whole positive range of signed int then the range should be explicitly tested in an assert, and the calculation should be ordered by introducing a sequence point (eg broken into two statements).
Doing this right is hard. Numerical computation is full of this stuff. Best to avoid, if possible. And don't accept random advice (even this!) without doing your own research.

Is there an expression using modulo to do backwards wrap-around ("reverse overflow")?

For any whole number input W restricted by the range R = [x,y], the "overflow," for lack of a better term, of W over R is W % (y-x+1) + x. This causes it to wrap back around if W exceeds y.
As an example of this principle, suppose we iterate over a calendar's months:
int this_month = 5;
int next_month = (this_month + 1) % 12;
where both integers will be between 0 and 11, inclusive. Thus, the expression above "clamps" the integer to the range R = [0,11]. This approach of using an expression is simple, elegant, and advantageous as it omits branching.
Now, what if we want to do the same thing, but backwards? The following expression works:
int last_month = ((this_month - 1) % 12 + 12) % 12;
but it's abstruse. How can it be beautified?
tl;dr - Can the expression ((x-1) % k + k) % k be simplified further?
Note: C++ tag specified because other languages handle negative operands for the modulo operator differently.

Your expression should be ((x-1) + k) % k. This will properly wrap x=0 around to 11. In general, if you want to step back more than 1, you need to make sure that you add enough so that the first operand of the modulo operation is >= 0.
Here is an implementation in C++:
int wrapAround(int v, int delta, int minval, int maxval)
{
const int mod = maxval + 1 - minval;
if (delta >= 0) {return (v + delta - minval) % mod + minval;}
else {return ((v + delta) - delta * mod - minval) % mod + minval;}
}
This also lets you use months labeled from 0 to 11 or from 1 to 12 by setting minval and maxval accordingly.
Since this answer is so highly appreciated, here is an improved version without branching, which also handles the case where the initial value v is smaller than minval. I keep the other example because it is easier to understand:
int wrapAround(int v, int delta, int minval, int maxval)
{
const int mod = maxval + 1 - minval;
v += delta - minval;
v += (1 - v / mod) * mod;
return v % mod + minval;
}
The only issue remaining is if minval is larger than maxval. Feel free to add an assertion if you need it.

k % k will always be 0. I'm not 100% sure what you're trying to do but it seems you want the last month to be clamped between 0 and 11 inclusive.
(this_month + 11) % 12
Should suffice.
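As a minimal sketch of the one-step-back case, assuming months are numbered 0 to 11 as in the question (the helper name `previous_month` is hypothetical):

```cpp
#include <cassert>

// Step back one month on a 0..11 calendar.
// Adding 11 is congruent to subtracting 1 modulo 12 and keeps the
// left operand of % non-negative for any this_month in [0, 11].
int previous_month(int this_month) {
    assert(this_month >= 0 && this_month <= 11);
    return (this_month + 11) % 12;
}
```

This only covers inputs that are already inside the range; stepping back by an arbitrary amount needs one of the more general functions in the other answers.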

The general solution is to write a function that computes the value that you want:
//Returns floor(a/n) (with the division done exactly).
//Let ÷ be mathematical division, and / be C++ division.
//We know
// a÷b = a/b + f (f is the remainder, not all
// divisions have exact Integral results)
//and
// (a/b)*b + a%b == a (from the standard).
//Together, these imply (through algebraic manipulation):
// sign(f) == sign(a%b)*sign(b)
//We want the remainder (f) to always be >=0 (by definition of flooredDivision),
//so when sign(f) < 0, we subtract 1 from a/n to make f > 0.
template<typename Integral>
Integral flooredDivision(Integral a, Integral n) {
Integral q(a/n);
if ((a%n < 0 && n > 0) || (a%n > 0 && n < 0)) --q;
return q;
}
//flooredModulo: Modulo function for use in the construction of
//looping topologies. The result will always be between 0 and the
//denominator, and will loop in a natural fashion (rather than swapping
//the looping direction over the zero point (as in C++11),
//or being unspecified (as in earlier C++)).
//Returns x such that:
//
//Real a = Real(numerator)
//Real n = Real(denominator)
//Real r = a - n*floor(a/n)
//x = Integral(r)
template<typename Integral>
Integral flooredModulo(Integral a, Integral n) {
return a - n * flooredDivision(a, n);
}

Easy peasy: do not use the first modulo operator, it is superfluous:
int last_month = (this_month - 1 + 12) % 12;
which is the general case.
In this instance you can write + 11, but I would still write - 1 + 12, as it more clearly states what you want to achieve.

Note that normal mod causes the pattern 0...11 to repeat at 12...23, 24...35, etc. but doesn't wrap on -11...-1. In other words, it has two sets of behaviors. One from -infinity...-1, and a different set of behavior from 0...infinity.
The expression ((x-1) % k + k) % k fixes -11...-1 but has the same problem as normal mod with -23...-12. I.e. while it fixes 12 additional numbers, it doesn't wrap around infinitely. It still has one set of behavior from -infinity...-12, and a different behavior from -11...+infinity.
This means that if you're using the function for offsets, it could lead to buggy code.
If you want a truly wrap around mod, it should handle the entire range, -infinity...infinity in exactly the same way.
There is probably a better way to implement this, but here is an easy to understand implementation:
// n must be greater than 0
func wrapAroundMod(a: Int, n: Int) -> Int {
var offsetTimes: Int = 0
if a < 0 {
offsetTimes = (-a / n) + 1
}
return (a + n * offsetTimes) % n
}

Not sure if you were having the same problem as me, but my problem was essentially that I wanted to constrain all numbers to a certain range. Say that range was 0-6, so using %7 means that any number higher than 6 will wrap back around to 0 or above. The actual problem is that numbers less than zero didn't wrap back around to 6. I have a solution to that (where X is the upper limit of your number range and 0 is the minimum):
if(inputNumber <0)//If this is a negative number
{
(X-(inputNumber*-1))%X;
}
else
{
inputNumber%X;
}

Find the lowest integer that matches equation in C++ [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Finding Hardy Ramanujan Numbers
I need to find the lowest natural number x where
x = k^3 + l^3 = i^3 + j^3
and (k, l, i, j) must all be different.
I tried the following four for loops, but I couldn't get it to the right solution because of infinitely increasing variables...
for (int i=0;;i++)
for (int j=i+1;;j++)
for (int k=j+1;;k++)
for (int l=k+1;;i++)
compare(i,j,k,l);

You need to reframe how you're thinking about the problem.
It's really saying this: what's the smallest natural number expressible as the sum of two cubes in two different ways?
The problem statement calls that number x, and the pairs of cubes are (i, j) and (k, l).
Restated in this way, it's not nearly so bad. Here's a hint in pseudocode:
function count_num_cubic_pairs(n):
cubic_pairs = []
for i..n:
first_cube = i * i * i
remainder = n - first_cube
if remainder is a cube and (first_cube, remainder) not in cubic_pairs:
cubic_pairs.add((first_cube, remainder))
return length(cubic_pairs)
The tough part will be testing whether remainder is a cube - floating point errors will complicate that a lot. That's the real meat of this problem - have fun with it.
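One way to sidestep the floating-point issue is to test for cubes with integers only. The following is a rough sketch of the pseudocode above under that assumption; the function names are made up for the example, and the linear scan for the cube root is chosen for clarity rather than speed.

```cpp
#include <cstdint>

// Counts the pairs (i, j) with 1 <= i <= j and i^3 + j^3 == n.
int count_cube_pairs(std::int64_t n) {
    int pairs = 0;
    // Requiring i <= j means i^3 can be at most n / 2.
    for (std::int64_t i = 1; 2 * i * i * i <= n; ++i) {
        std::int64_t remainder = n - i * i * i;
        // Integer cube test: scan j upward until j^3 reaches remainder.
        std::int64_t j = i;
        while (j * j * j < remainder) ++j;
        if (j * j * j == remainder) ++pairs;
    }
    return pairs;
}

// Smallest n that is a sum of two positive cubes in two different ways.
std::int64_t smallest_two_way_cube_sum() {
    for (std::int64_t n = 2;; ++n) {
        if (count_cube_pairs(n) >= 2) return n;
    }
}
```

Calling `smallest_two_way_cube_sum()` returns 1729, which is 1^3 + 12^3 and also 9^3 + 10^3.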

One easy way to make your code work is to limit the domain of your variables, and then expand it a bit at a time.
As mazayus mentioned, you're keeping each variable strictly greater than the previous ones, so you never try any variation that could possibly be correct.
Something like this may work (pseudocode) but it's horribly inefficient:
for max in [100, 200, 300, ...]
for i in [0..max]
for j in [0..max]
for k in [0..max]
for l in [0..max]
if (i equals k or l, or j equals k or l) continue
if (i^3 + j^3 equals k^3 + l^3)
return answer

int i = 1;
int j = 3;
int k = 2;
int l = 4;
do {
    do {
        do {
            do {
                compare(i, j, k, l);
                i++;
            } while (i < k);
            k++;
        } while (k < j);
        j++;
    } while (j < l);
    l++;
} while (l < 100);
Something like this tries every combination of numbers without dups (up to values of 100), with i < k < j < l.

Your loops assume i<j<k<l, which is not necessarily true. (It might be that j>k.) Once you get the right assumptions, you can reorder your loops so the first item is the biggest and the other loops are bounded by it.
Here's an example with i>j and i>k>l:
for (int i=1;;i++)
    for (int j=1;j<i;j++)
        for (int k=1;k<i;k++)
            for (int l=1;l<k;l++)
                compare(i,j,k,l);
Once you get that working, try eliminating the fourth loop by checking if the cube root of i*i*i+j*j*j-k*k*k is a natural number. Then try finding a smarter starting value for k.

Finding all paths down stairs?

I was given the following problem in an interview:
Given a staircase with N steps, you can go up with 1 or 2 steps each time. Output all possible way you go from bottom to top.
For example:
N = 3
Output :
1 1 1
1 2
2 1
When interviewing, I just said to use dynamic programming.
S(n) = S(n-1) +1 or S(n) = S(n-1) +2
However, during the interview, I didn't write very good code for this. How would you code up a solution to this problem?
Thanks indeed!

I won't write the code for you (since it's a great exercise), but this is a classic dynamic programming problem. You're on the right track with the recurrence; it's true that
S(0) = 1
Since if you're at the bottom of the stairs there's exactly one way to do this. We also have that
S(1) = 1
Because if you're one step high, your only option is to take a single step down, at which point you're at the bottom.
From there, the recurrence for the number of solutions is easy to find. If you think about it, any sequence of steps you take either ends with taking one small step as your last step or one large step as your last step. In the first case, each of the S(n - 1) solutions for n - 1 stairs can be extended into a solution by taking one more step, while in the second case each of the S(n - 2) solutions to the n - 2 stairs case can be extended into a solution by taking two steps. This gives the recurrence
S(n) = S(n - 2) + S(n - 1)
Notice that to evaluate S(n), you only need access to S(n - 2) and S(n - 1). This means that you could solve this with dynamic programming using the following logic:
Create an array S with n + 1 elements in it, indexed by 0, 1, 2, ..., n.
Set S[0] = S[1] = 1
For i from 2 to n, inclusive, set S[i] = S[i - 1] + S[i - 2].
Return S[n].
The runtime for this algorithm is a beautiful O(n) with O(n) memory usage.
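The array-based steps above can be sketched roughly as follows. The function name `count_ways` and the upper bound on n are assumptions for this example; the bound is there because the counts grow like Fibonacci numbers and would eventually overflow a 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of ways to climb n steps taking 1 or 2 steps at a time.
std::uint64_t count_ways(int n) {
    assert(n >= 0 && n <= 90);  // stay well inside the 64-bit range
    // At least two slots so that S[1] exists even when n == 0.
    std::vector<std::uint64_t> S(n < 1 ? 2 : n + 1);
    S[0] = 1;
    S[1] = 1;
    for (int i = 2; i <= n; ++i) {
        S[i] = S[i - 1] + S[i - 2];
    }
    return S[n];
}
```

Since each entry depends only on the previous two, the vector could be replaced by two variables if O(1) memory is wanted.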
However, it's possible to do much better than this. In particular, let's take a look at the first few terms of the sequence, which are
S(0) = 1
S(1) = 1
S(2) = 2
S(3) = 3
S(4) = 5
This looks a lot like the Fibonacci sequence, and in fact you might be able to see that
S(0) = F(1)
S(1) = F(2)
S(2) = F(3)
S(3) = F(4)
S(4) = F(5)
This suggests that, in general, S(n) = F(n + 1). We can actually prove this by induction on n as follows.
As our base cases, we have that
S(0) = 1 = F(1) = F(0 + 1)
and
S(1) = 1 = F(2) = F(1 + 1)
For the inductive step, we get that
S(n) = S(n - 2) + S(n - 1) = F(n - 1) + F(n) = F(n + 1)
And voila! We've gotten this series written in terms of Fibonacci numbers. This is great, because it's possible to compute the Fibonacci numbers in O(1) space and O(lg n) time. There are many ways to do this. One uses the fact that
F(n) = (1 / √5) (Φ^n - φ^n)
Here, Φ is the golden ratio, (1 + √5) / 2 (about 1.6), and φ is 1 - Φ, about -0.6. Because the second term shrinks towards zero very quickly, you can get the nth Fibonacci number by computing
(1 / √5) Φ^n
and rounding to the nearest integer. Moreover, you can compute Φ^n in O(lg n) time by repeated squaring. The idea is that we can use this cool recurrence:
x^0 = 1
x^(2n) = x^n * x^n
x^(2n + 1) = x * x^n * x^n
You can show using a quick inductive argument that this terminates in O(lg n) time, which means that you can solve this problem using O(1) space and O(lg n) time, which is substantially better than the DP solution.
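Here is a small sketch of the repeated-squaring recurrence. It uses unsigned integers instead of Φ so that the results are exact and easy to check; that substitution, and the name `power`, are assumptions of this example. Note that this recursive form uses O(lg n) stack frames, so an iterative loop would be needed for strictly O(1) space.

```cpp
#include <cstdint>

// Computes x^n with O(lg n) multiplications.
std::uint64_t power(std::uint64_t x, unsigned n) {
    if (n == 0) return 1;                    // x^0 = 1
    std::uint64_t half = power(x, n / 2);    // x^(n / 2), rounded down
    std::uint64_t squared = half * half;     // x^(2k) = x^k * x^k
    return (n % 2 == 0) ? squared            // even exponent
                        : x * squared;       // x^(2k + 1) = x * x^k * x^k
}
```

With floating-point Φ the same structure applies, but rounding error grows with n, so the result is only trustworthy for moderate n.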
Hope this helps!

You can generalize your recursive function to also take already made moves.
void steps(n, alreadyTakenSteps) {
if (n == 0) {
print already taken steps
}
if (n >= 1) {
steps(n - 1, alreadyTakenSteps.append(1));
}
if (n >= 2) {
steps(n - 2, alreadyTakenSteps.append(2));
}
}
It's not really the code, more of a pseudocode, but it should give you an idea.

Your solution sounds right.
S(n):
If n = 1 return {1}
If n = 2 return {2, (1,1)}
Return S(n-1)x{1} U S(n-2)x{2}
(U is Union, x is Cartesian Product)
Memoizing this is trivial, and would make it O(Fib(n)).

Great answer by @templatetypedef - I did this problem as an exercise and arrived at the Fibonacci numbers on a different route:
The problem can basically be reduced to an application of binomial coefficients, which are handy for combination problems: the number of combinations of n things taken k at a time (called n choose k) can be found by the equation C(n, k) = n! / (k! (n - k)!).
Given that and the problem at hand you can calculate a solution brute force (just doing the combination count). The number of "take 2 steps" must be zero at least and may be 50 at most, so the number of combinations is the sum of C(n,k) for 0 <= k <= 50 ( n= number of decisions to be made, k = number of 2's taken out of those n)
BigInteger combinationCount = 0;
for (int k = 0; k <= 50; k++)
{
int n = 100 - k;
BigInteger result = Fact(n) / (Fact(k) * Fact(n - k));
combinationCount += result;
}
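The sum of these binomial coefficients just happens to also have a different formula: with N total steps, the sum of C(N - k, k) over 0 <= k <= N / 2 equals the Fibonacci number F(N + 1).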

Actually, you can prove that the number of ways to climb is just the fibonacci sequence. Good explanation here: http://theory.cs.uvic.ca/amof/e_fiboI.htm

Solving the problem, and solving it using a dynamic programming solution are potentially two different things.
http://en.wikipedia.org/wiki/Dynamic_programming
In general, to solve a given problem, we need to solve different parts of the problem (subproblems), then combine the solutions of the subproblems to reach an overall solution. Often, many of these subproblems are really the same. The dynamic programming approach seeks to solve each subproblem only once, thus reducing the number of computations
This leads me to believe you want to look for a solution that is both Recursive, and uses the Memo Design Pattern. Recursion solves a problem by breaking it into sub-problems, and the Memo design pattern allows you to cache answers, thus avoiding re-calculation. (Note that there are probably cache implementations that aren't the Memo design pattern, and you could use one of those as well).
Solving:
The first step I would take would be to solve some set of problems by hand, with varying or increasing sizes of N. This will give you a pattern to help you figure out a solution. Start with N = 1, through N = 5. (As others have stated, it may be a form of the Fibonacci sequence, but I would determine this for myself before calling the problem solved and understood.)
From there, I would try to make a generalized solution that used recursion. Recursion solves a problem by breaking it into sub-problems.
From there, I would try to make a cache of previous problem inputs to the corresponding output, hence memoizing it, and making a solution that involved "Dynamic Programming".
I.e., maybe the inputs to one of your functions are 2, 5, and the correct result was 7. Make some function that looks this up from an existing list or dictionary (based on the input). It will look for a call that was made with the inputs 2, 5. If it doesn't find it, call the function to calculate it, then store it and return the answer (7). If it does find it, don't bother calculating it, and return the previously calculated answer.

Here is a simple solution to this question in very simple CSharp (I believe you can port this with almost no change to Java/C++).
I have added a little bit more of complexity to it (adding the possibility that you can also walk 3 steps). You can even generalize this code to "from 1 to k-steps" if desired with a while loop in the addition of steps (last if statement).
I have used a combination of both dynamic programming and recursion. The use of dynamic programming avoids the recalculation of each previous step, reducing the space and time complexity related to the call stack. It does, however, add some space complexity (O(maxSteps)), which I think is negligible compared to the gain.
/// <summary>
/// Given a staircase with N steps, you can go up with 1 or 2 or 3 steps each time.
/// Output all possible way you go from bottom to top
/// </summary>
public class NStepsHop
{
const int maxSteps = 500; // this is arbitrary
static long[] HistorySumSteps = new long[maxSteps];
public static long CountWays(int n)
{
if (n >= 0 && HistorySumSteps[n] != 0)
{
return HistorySumSteps[n];
}
long currentSteps = 0;
if (n < 0)
{
return 0;
}
else if (n == 0)
{
currentSteps = 1;
}
else
{
currentSteps = CountWays(n - 1) +
CountWays(n - 2) +
CountWays(n - 3);
}
HistorySumSteps[n] = currentSteps;
return currentSteps;
}
}
You can call it in the following manner
long result;
result = NStepsHop.CountWays(0); // result = 1
result = NStepsHop.CountWays(1); // result = 1
result = NStepsHop.CountWays(5); // result = 13
result = NStepsHop.CountWays(10); // result = 274
result = NStepsHop.CountWays(25); // result = 2555757
You can argue that in the initial case, when n = 0, the result could be 0 instead of 1. I decided to go for 1; however, modifying this assumption is trivial.

the problem can be solved quite nicely using recursion:
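#include <cstdio>

// Forward declaration: printSteps calls generatePath before its definition.
void generatePath(int n, char* out, int recLvl);

void printSteps(int n)
{
    char* output = new char[n + 1];
    generatePath(n, output, 0);
    printf("\n");
    delete[] output; // release the buffer allocated above
}
void generatePath(int n, char* out, int recLvl)
{
    if (n == 0)
    {
        // A complete path: terminate the string and print it.
        out[recLvl] = '\0';
        printf("%s\n", out);
    }
    if (n >= 1)
    {
        out[recLvl] = '1';
        generatePath(n - 1, out, recLvl + 1);
    }
    if (n >= 2)
    {
        out[recLvl] = '2';
        generatePath(n - 2, out, recLvl + 1);
    }
}
and in main:
int main()
{
    printSteps(0);
    printSteps(3);
    printSteps(4);
    return 0;
}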

It's a weighted graph problem.
From 0 you can get to 1 only 1 way (0-1).
You can get to 2 two ways, from 0 and from 1 (0-2, 1-1).
You can get to 3 three ways, from 1 and from 2 (2 has two ways).
You can get to 4 five ways, from 2 and from 3 (2 has two ways and 3 has three ways).
You can get to 5 eight ways, ...
A recursive function should be able to handle this, working backwards from N.

Complete C-Sharp code for this
void PrintAllWays(int n, string str)
{
string str1 = str;
StringBuilder sb = new StringBuilder(str1);
if (n == 0)
{
Console.WriteLine(str1);
return;
}
if (n >= 1)
{
sb = new StringBuilder(str1);
PrintAllWays(n - 1, sb.Append("1").ToString());
}
if (n >= 2)
{
sb = new StringBuilder(str1);
PrintAllWays(n - 2, sb.Append("2").ToString());
}
}

Late C-based answer
#include <stdio.h>
#include <stdlib.h>
#define steps 60
static long long unsigned int MAP[steps + 1] = {1 , 1 , 2 , 0,};
static long long unsigned int countPossibilities(unsigned int n) {
if (!MAP[n]) {
MAP[n] = countPossibilities(n-1) + countPossibilities(n-2);
}
return MAP[n];
}
int main() {
printf("%llu",countPossibilities(steps));
}

Here is a C++ solution. This prints all possible paths for a given number of stairs.
#include <iostream>
#include <vector>
using namespace std;
// Utility function to print a Vector of Vectors
void printVecOfVec(vector< vector<unsigned int> > vecOfVec)
{
for (unsigned int i = 0; i < vecOfVec.size(); i++)
{
for (unsigned int j = 0; j < vecOfVec[i].size(); j++)
{
cout << vecOfVec[i][j] << " ";
}
cout << endl;
}
cout << endl;
}
// Given a source vector and a number, it appends the number to each source vectors
// and puts the final values in the destination vector
void appendElementToVector(vector< vector <unsigned int> > src,
unsigned int num,
vector< vector <unsigned int> > &dest)
{
for (int i = 0; i < src.size(); i++)
{
src[i].push_back(num);
dest.push_back(src[i]);
}
}
// Ladder Problem
void ladderDynamic(int number)
{
vector< vector<unsigned int> > vecNminusTwo = {{}};
vector< vector<unsigned int> > vecNminusOne = {{1}};
vector< vector<unsigned int> > vecResult;
for (int i = 2; i <= number; i++)
{
// Empty the result vector to hold fresh set
vecResult.clear();
// Append '2' to all N-2 ladder positions
appendElementToVector(vecNminusTwo, 2, vecResult);
// Append '1' to all N-1 ladder positions
appendElementToVector(vecNminusOne, 1, vecResult);
vecNminusTwo = vecNminusOne;
vecNminusOne = vecResult;
}
printVecOfVec(vecResult);
}
int main()
{
ladderDynamic(6);
return 0;
}

Maybe I am wrong, but it should be:
S(1) =0
S(2) =1
Here we are considering permutations, so in that way
S(3) =3
S(4) =7

How to find if 3 numbers in a set of size N exactly sum up to M

I want to know how I can implement a better solution than O(N^3). It's similar to the knapsack and subset problems. In my question N <= 8000, so I started computing the sums of pairs of numbers and stored them in an array. Then I would binary search in the sorted set for each (M - sum[i]) value, but the problem is how to keep track of the indices that summed up to sum[i]. I know I could declare extra space, but my sums array already has a size of 64 million, and hence I couldn't complete my O(N^2) solution. Please advise if I can do some optimization or if I need a totally different technique.

You could benefit from some generic tricks to improve the performance of your algorithm.
1) Don't store what you use only once
It is a common error to store more than you really need. Whenever your memory requirements seem to blow up, the first question to ask yourself is: do I really need to store that stuff? Here it turns out that you do not (as Steve explained in the comments): compute the sum of two numbers (in a triangular fashion to avoid repeating yourself) and then check for the presence of the third one.
We drop the O(N**2) memory complexity! Now expected memory is O(N).
2) Know your data structures, and in particular: the hash table
Perfect hash tables are rarely (if ever) implemented, but it is (in theory) possible to craft hash tables with O(1) insertion, check and deletion characteristics, and in practice you do approach those complexities (though it generally comes at the cost of a high constant factor that will make you prefer so-called suboptimal approaches).
Therefore, unless you need ordering (for some reason), membership is better tested through a hash table in general.
We drop the 'log N' term in the speed complexity.
With those two recommendations you easily get what you were asking for:
Build a simple hash table: the number is the key, the index the satellite data associated
Iterate in triangle fashion over your data set: for i in [0..N-1]; for j in [i+1..N-1]
At each iteration, check if K = M - set[i] - set[j] is in the hash table, if it is, extract k = table[K] and if k != i and k != j store the triple (i,j,k) in your result.
If a single result is sufficient, you can stop iterating as soon as you get the first result, otherwise you just store all the triples.
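The steps above could look roughly like this in C++17. This is a sketch that stops at the first result and assumes the input values are distinct, as in a set; with duplicates the table keeps only the last index for each value. The name `find_triple` is made up for the example, and `std::unordered_map` stands in for the hash table.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Returns indices (i, j, k) with set[i] + set[j] + set[k] == M, if any exist.
std::optional<std::array<std::size_t, 3>>
find_triple(const std::vector<long long>& set, long long M) {
    // Hash table: the number is the key, its index the satellite data.
    std::unordered_map<long long, std::size_t> index_of;
    for (std::size_t i = 0; i < set.size(); ++i) {
        index_of[set[i]] = i;
    }
    // Triangular iteration over all pairs (i, j) with i < j.
    for (std::size_t i = 0; i < set.size(); ++i) {
        for (std::size_t j = i + 1; j < set.size(); ++j) {
            const long long K = M - set[i] - set[j];
            const auto it = index_of.find(K);
            if (it != index_of.end() && it->second != i && it->second != j) {
                return std::array<std::size_t, 3>{i, j, it->second};
            }
        }
    }
    return std::nullopt;
}
```

Memory stays at O(N) for the table, and the expected running time is O(N^2) because each lookup is O(1) on average.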

There is a simple O(n^2) solution to this that uses only O(1)* memory if you only want to find the 3 numbers (O(n) memory if you want the indices of the numbers and the set is not already sorted).
First, sort the set.
Then for each element in the set, see if there are two (other) numbers that sum to M minus that element. This is a common interview question and can be done in O(n) on a sorted set.
The idea is that you start one pointer at the beginning and one at the end. If the current sum equals the target, you are done; if it is greater than the target, decrement the end pointer; otherwise increment the start pointer.
So for each of the n numbers we do an O(n) search and we get an O(n^2) algorithm.
*Note that this requires a sort that uses O(1) memory. Hell, since the sort need only be O(n^2) you could use bubble sort. Heapsort is O(n log n) and uses O(1) memory.
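A minimal sketch of the sort-then-scan approach described above follows. For simplicity it takes the vector by value and uses `std::sort`, so it does not reproduce the O(1) memory bound; sorting the caller's data in place with a heapsort (for example `std::make_heap` followed by `std::sort_heap`) would be closer to that. The name `has_triple_sum` is an assumption of the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns true if three elements at distinct positions sum to M.
bool has_triple_sum(std::vector<long long> nums, long long M) {
    std::sort(nums.begin(), nums.end());
    const std::size_t n = nums.size();
    for (std::size_t a = 0; a + 2 < n; ++a) {
        // Look for two elements to the right of a that sum to M - nums[a].
        const long long target = M - nums[a];
        std::size_t lo = a + 1;
        std::size_t hi = n - 1;
        while (lo < hi) {
            const long long sum = nums[lo] + nums[hi];
            if (sum == target) return true;
            if (sum > target) {
                --hi;  // sum too large: move the end pointer down
            } else {
                ++lo;  // sum too small: move the start pointer up
            }
        }
    }
    return false;
}
```

Restricting the inner scan to positions after a is enough, because any triple can be ordered so that its smallest element plays the role of a.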

Create a "bitset" of all the numbers which makes it constant time to check if a number is there. That is a start.
The solution will then be at most O(N^2) to make all combinations of 2 numbers.
The only tricky bit here is when the solution contains a repeat, but it doesn't really matter, you can discard repeats unless it is the same number 3 times because you will hit the "repeat" case when you pair up the 2 identical numbers and see if the unique one is present.
The 3 times one is simply a matter of checking if M is divisible by 3 and whether M/3 appears 3 times as you create the bitset.
This solution does require creating extra storage, up to MAX/8 where MAX is the highest number in your set. You could use a hash table though if this number exceeds a certain point: still O(1) lookup.

This appears to work for me...
#include <iostream>
#include <set>
#include <algorithm>
using namespace std;
int main(void)
{
set<long long> keys;
// By default this set is sorted
set<short> N;
N.insert(4);
N.insert(8);
N.insert(19);
N.insert(5);
N.insert(12);
N.insert(35);
N.insert(6);
N.insert(1);
typedef set<short>::iterator iterator;
const short M = 18;
for(iterator i(N.begin()); i != N.end() && *i < M; ++i)
{
short d1 = M - *i; // subtract the value at this location
// if there is more to "consume"
if (d1 > 0)
{
// ignore below i as we will have already scanned it...
for(iterator j(i); j != N.end() && *j < M; ++j)
{
short d2 = d1 - *j; // again "consume" as much as we can
// now the remainder must exist in our set N
if (N.find(d2) != N.end())
{
// means that the three numbers we've found, *i (from first loop), *j (from second loop) and d2 exist in our set of N
// now to generate the unique combination, we need to generate some form of key for our keys set
// here we take advantage of the fact that all the numbers fit into a short, we can construct such a key with a long long (8 bytes)
// the 8 byte key is made up of 2 bytes for i, 2 bytes for j and 2 bytes for d2
// and is formed in sorted order
long long key = *i; // first index is easy
// second index slightly trickier, if it's less than j, then this short must be "after" i
if (*i < *j)
key = (key << 16) | *j;
else
key |= (static_cast<int>(*j) << 16); // else it's before i
// now the key is either: i | j, or j | i (where i & j are two bytes each, and the key is currently 4 bytes)
// third index is a bugger, we have to scan the key in two byte chunks to insert our third short
if ((key & 0xFFFF) < d2)
key = (key << 16) | d2; // simple, it's the largest of the three
else if (((key >> 16) & 0xFFFF) < d2)
key = (((key << 16) | (key & 0xFFFF)) & 0xFFFF0000FFFFLL) | (d2 << 16); // its less than j but greater i
else
key |= (static_cast<long long>(d2) << 32); // it's less than i
// Now if this unique key already exists in the hash, this won't insert an entry for it
keys.insert(key);
}
// else don't care...
}
}
}
// tells us how many unique combinations there are
cout << "size: " << keys.size() << endl;
// prints out the 6 bytes for representing the three numbers
for(set<long long>::iterator it (keys.begin()), end(keys.end()); it != end; ++it)
cout << hex << *it << endl;
return 0;
}
Okay, here is attempt two: this generates the output:
start: 19
size: 4
10005000c
400060008
500050008
600060006
As you can see from there, the first "key" is the three shorts (in hex), 0x0001, 0x0005, 0x000C (which is 1, 5, 12 = 18), etc.
Okay, cleaned up the code some more, realised that the reverse iteration is pointless..
My Big O notation is not the best (never studied computer science), however I think the above is something like, O(N) for outer and O(NlogN) for inner, reason for log N is that std::set::find() is logarithmic - however if you replace this with a hashed set, the inner loop could be as good as O(N) - please someone correct me if this is crap...

I combined the suggestions by @Matthieu M. and @Chris Hopman, and (after much trial and error) I came up with this algorithm that should be O(n log n + log (n-k)! + k) in time and O(log(n-k)) in space (the stack). That should be O(n log n) overall. It's in Python, but it doesn't use any Python-specific features.
import bisect

def binsearch(r, q, i, j):  # O(log (j-i))
    return bisect.bisect_left(q, r, i, j)

def binfind(q, m, i, j):
    while i + 1 < j:
        r = m - (q[i] + q[j])
        if r < q[i]:
            j -= 1
        elif r > q[j]:
            i += 1
        else:
            k = binsearch(r, q, i + 1, j - 1)  # O(log (j-i))
            if not (i < k < j):
                return None
            elif q[k] == r:
                return (i, k, j)
            else:
                return (
                    binfind(q, m, i + 1, j)
                    or
                    binfind(q, m, i, j - 1)
                )

def find_sumof3(q, m):
    return binfind(sorted(q), m, 0, len(q) - 1)
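
For readers following the C++ answers, the call bisect.bisect_left(q, r, i, j) corresponds closely to std::lower_bound over the half-open range [i, j). A minimal sketch, where the function name bisectLeft is hypothetical and q is assumed to be sorted in ascending order:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C++ counterpart of Python's bisect.bisect_left(q, r, i, j).
// Returns the first index k in [i, j] such that q[k] >= r, or j if every
// element in q[i..j) is smaller than r.
// Assumes q is sorted ascending and 0 <= i <= j <= q.size().
std::size_t bisectLeft(const std::vector<int>& q, int r, std::size_t i, std::size_t j) {
    auto first = q.begin() + static_cast<std::ptrdiff_t>(i);
    auto last = q.begin() + static_cast<std::ptrdiff_t>(j);
    return static_cast<std::size_t>(std::lower_bound(first, last, r) - q.begin());
}
```

As in the Python version, the upper bound j is exclusive, so the result can equal j when r is larger than everything in the searched range.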

Not trying to boast about my programming skills or add redundant stuff here.
Just wanted to provide beginners with an implementation in C++.
Implementation based on the pseudocode provided by Charles Ma in "Given an array of numbers, find out if 3 of them add up to 0".
I hope the comments help.
#include <iostream>
#include <vector>
using namespace std;

void merge(int originalArray[], int low, int high, int sizeOfOriginalArray) {
    // Step 4: Merge sorted halves into an auxiliary array
    // (a vector is used because variable-length arrays are not standard C++)
    vector<int> aux(sizeOfOriginalArray);
    int auxArrayIndex, left, right, mid;
    auxArrayIndex = low;
    mid = (low + high) / 2;
    right = mid + 1;
    left = low;
    // choose the smaller of the two values "pointed to" by left, right
    // copy that value into aux[auxArrayIndex]
    // increment either left or right as appropriate
    // increment auxArrayIndex
    while ((left <= mid) && (right <= high)) {
        if (originalArray[left] <= originalArray[right]) {
            aux[auxArrayIndex] = originalArray[left];
            left++;
            auxArrayIndex++;
        } else {
            aux[auxArrayIndex] = originalArray[right];
            right++;
            auxArrayIndex++;
        }
    }
    // here when one of the two sorted halves has "run out" of values, but
    // there are still some in the other half; copy all the remaining values
    // to aux
    // Note: only 1 of the next 2 loops will actually execute
    while (left <= mid) {
        aux[auxArrayIndex] = originalArray[left];
        left++;
        auxArrayIndex++;
    }
    while (right <= high) {
        aux[auxArrayIndex] = originalArray[right];
        right++;
        auxArrayIndex++;
    }
    // all values are in aux; copy them back into originalArray
    int index = low;
    while (index <= high) {
        originalArray[index] = aux[index];
        index++;
    }
}

void mergeSortArray(int originalArray[], int low, int high) {
    // aux in merge() is indexed from low to high, so high + 1 slots are enough
    int sizeOfOriginalArray = high + 1;
    // base case
    if (low >= high) {
        return;
    }
    // Step 1: Find the middle of the array (conceptually, divide it in half)
    int mid = (low + high) / 2;
    // Steps 2 and 3: Recursively sort the 2 halves of originalArray and then merge those
    mergeSortArray(originalArray, low, mid);
    mergeSortArray(originalArray, mid + 1, high);
    merge(originalArray, low, high, sizeOfOriginalArray);
}

// O(n^2) solution without hash tables
// Basically using a sorted array, for each number in the array, you use two pointers, one starting just after the number and one starting from the end of the array. Check if the sum of the three elements (the two pointed to and the current number) is >, < or == to the targetSum, and advance the pointers accordingly or return true if the targetSum is found.
bool is3SumPossible(int originalArray[], int targetSum, int sizeOfOriginalArray) {
    int high = sizeOfOriginalArray - 1;
    mergeSortArray(originalArray, 0, high);
    int temp;
    for (int k = 0; k < sizeOfOriginalArray; k++) {
        // i starts at k + 1 and stops before j so that no element is counted twice
        for (int i = k + 1, j = sizeOfOriginalArray - 1; i < j; ) {
            temp = originalArray[k] + originalArray[i] + originalArray[j];
            if (temp == targetSum) {
                return true;
            } else if (temp < targetSum) {
                i++;
            } else {
                j--;
            }
        }
    }
    return false;
}

int main()
{
    int arr[] = {2, -5, 10, 9, 8, 7, 3};
    int size = sizeof(arr) / sizeof(int);
    int targetSum = 5;
    // 3Sum possible?
    // The size of the array is passed as a function parameter because the array itself
    // is passed as a pointer, so the size cannot be computed inside is3SumPossible()
    bool ans = is3SumPossible(arr, targetSum, size);
    if (ans) {
        cout << "Possible";
    } else {
        cout << "Not possible";
    }
    return 0;
}
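
For comparison, the same two-pointer idea can be written more compactly by leaning on the standard library. This is a sketch under stated assumptions rather than a replacement for the answer above: the name hasTripleSum is hypothetical, std::sort stands in for the hand-written merge sort, and the sum is computed in long long to sidestep int overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if three elements at distinct positions sum to target.
// The vector is taken by value so the caller's data is left unsorted.
bool hasTripleSum(std::vector<int> nums, long long target) {
    std::sort(nums.begin(), nums.end());
    const std::size_t n = nums.size();
    // k + 2 < n guarantees there is room for i and j after k
    for (std::size_t k = 0; k + 2 < n; ++k) {
        std::size_t i = k + 1;  // lower pointer, just after the fixed element
        std::size_t j = n - 1;  // upper pointer, at the end of the array
        while (i < j) {
            const long long sum = static_cast<long long>(nums[k]) + nums[i] + nums[j];
            if (sum == target) {
                return true;
            }
            if (sum < target) {
                ++i;  // need a larger sum
            } else {
                --j;  // need a smaller sum
            }
        }
    }
    return false;
}
```

Sorting costs O(n log n) and the nested scan is O(n^2), so the overall bound matches the hand-written version.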
