Algorithm to determine that a 2x2 square contains the numbers 1-4 (no repeats) - c++

What would be an applicable C++ algorithm to determine that a 2x2 square (say, represented by a 1d vector) contains the numbers 1-4? I can't think of this, although it is quite simple. I would prefer to not have a giant if statement.
Examples of appropriate squares
1 2
3 4
2 3
4 1
1 3
2 4
Inappropriate squares:
1 1
2 3
1 2
3 3
1 2
4 4

I would probably start with an unsigned int set to 0 (e.g., call it x). I'd assign one bit in x to each possible input number (e.g., 1->bit 0, 2->bit 1, 3->bit 2, 4->bit 3). As I read the numbers, I'd verify that the number was in range, and if it was, set the corresponding bit in x.
At the end, if all the numbers are different, I should have 4 bits of x set. If any of the numbers was repeated, some of those bits won't be set.
If you prefer, you could use std::bitset or std::vector<bool> instead of the bits in a single number. In this case a single number is probably easier though, because you can verify the presence of all four desired bits with a single comparison.
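The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `containsOneToFour` and the choice of `std::vector<int>` as the container are mine, not part of the question.

```cpp
#include <vector>

// Hypothetical helper: true when `square` holds exactly the values 1..4, each once.
bool containsOneToFour(const std::vector<int>& square) {
    if (square.size() != 4) return false;
    unsigned seen = 0;                      // bit (v - 1) is set once value v has been read
    for (int v : square) {
        if (v < 1 || v > 4) return false;   // reject anything outside the range 1..4
        seen |= 1u << (v - 1);
    }
    return seen == 0xFu;                    // all four bits set means no value was repeated
}
```

A repeated value sets the same bit twice, so with only four inputs at least one of the four bits stays clear and the final comparison fails.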

bool valid(const unsigned square[4]) {
    unsigned r = 0;
    for (int i = 0; i < 4; ++i)
        r |= 1u << square[i];   // set bit n for each value n
    return r == 30;             // 30 is binary 11110: bits 1 to 4 all set
}
Just set the appropriate bits, and check whether all are set at the end.
Note that it assumes every number is smaller than sizeof(unsigned) * CHAR_BIT; a larger shift is undefined behaviour.

Well if it's represented by a vector and we just want something that works:
bool isValidSquare(const std::vector<int>& square) {
if (square.size() == 4) {
std::set<int> uniqs(square.begin(), square.end());
return uniqs.count(1) && uniqs.count(2) && uniqs.count(3) && uniqs.count(4);
}
return false;
}

Create a static bitset with the bits for 1-4 set, and a second one with all bits unset.
Traverse the vector, setting the bit in the second bitset that corresponds to the current element.
Compare the two bitsets. If they match, the square is appropriate. Otherwise, it isn't.
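A minimal sketch of that description, assuming a `std::bitset<5>` in which bit 0 is simply left unused. The name `isValidSquareBitset` and the explicit range and size checks are my additions for illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical helper following the two-bitset description above.
bool isValidSquareBitset(const std::vector<int>& square) {
    static const std::bitset<5> expected(0x1Eu);   // bits 1..4 set, bit 0 unused
    if (square.size() != 4) return false;
    std::bitset<5> seen;                           // starts with all bits unset
    for (int v : square) {
        if (v < 1 || v > 4) return false;          // keep set() inside the bitset
        seen.set(static_cast<std::size_t>(v));
    }
    return seen == expected;
}
```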

You can use the standard library for this
#include <iostream>
#include <algorithm>
#include <vector>
int main()
{
std::vector<int> input{1,5,2,4};
std::sort(std::begin(input), std::end(input));
std::cout << std::boolalpha
<< std::equal(std::begin(input), std::end(input), std::begin({1,2,3,4}));
}

Assuming every input is a number from 1 to 4 (an assumption based on your examples), you can XOR them together and check whether the result is 4:
if ((tab[0] ^ tab[1] ^ tab[2] ^ tab[3]) == 4)
// Matches !
I had the feeling this would work but am too tired to prove it mathematically, so here is a Python program that checks every combination exhaustively:
numbers = [1, 2, 3, 4]
good_results = []
bad_results = []
for i in numbers:
    for j in numbers:
        for k in numbers:
            for l in numbers:
                res = i ^ j ^ k ^ l
                print("%i %i %i %i -> %i" % (i, j, k, l, res))
                if len(set([i, j, k, l])) == 4:  # this condition checks if i, j, k and l are different
                    good_results.append(res)
                else:
                    bad_results.append(res)
print(set(good_results))  # => {4}
print(set(bad_results))  # => {0, 1, 2, 3, 5, 6, 7}

Related

Maximize XOR Equation

Problem statement:
Given an array of n elements and an integer k, find an integer x in
the range [0,k] such that Xor-sum(x) is maximized. Print the maximum
value of the equation.
Xor-sum(x) = (x XOR A[1]) + (x XOR A[2]) + (x XOR A[3]) + ... + (x XOR A[N])
Input Format
The first line contains integer N denoting the number of elements in
A. The next line contains an integer, k, denoting the maximum value
of x. Each line i of the N subsequent lines (where 0<=i<N) contains
an integer describing Ai.
Constraints
1<=n<=10^5
0<=k<=10^9
0<=A[i]<=10^9
Sample Input
3
7
1
6
3
Sample Output
14
Explanation
Xor_sum(4)=(4^1)+(4^6)+(4^3)=14.
This problem was asked in an Infosys recruitment test. I was going through previous year papers &
I came across this problem.
I was only able to come up with a brute-force solution which is just to calculate the equation
for every x in the range [0,k] and print the maximum. But, the solution won't work for the
given constraints.
My solution
#include <bits/stdc++.h>
using namespace std;
int main()
{
    int n, k;
    long long ans = 0; // sums can reach roughly 10^5 * 2^30, too large for int
    cin >> n >> k;
    vector<int> a(n);
    for (int i = 0; i < n; i++) cin >> a[i];
    for (int i = 0; i <= k; i++) {
        long long temp = 0;
        for (int j = 0; j < n; j++) {
            temp += (i ^ a[j]);
        }
        ans = max(temp, ans);
    }
    cout << ans;
return 0;
}
I found the solution on a website. I was unable to understand what the code does but, this solution gives incorrect answer for some test cases.
Scroll down to question 3
The trick here is that XOR works on bits in parallel, independently. You can optimize each bit of x. Brute-forcing this takes 2*32 tries, given the constraints.
As said in other comments each bit of x will give an independent contribution to the sum, so the first step is to calculate the added value for each possible bit.
To do this for the i-th bit of x count the number of 0s and 1s in the same position of each number in the array, if the difference N0 - N1 is positive then the added value is also positive and equal to (N0-N1) * 2^i, let's call such bits "useful".
The number x will be a combination of useful bits only.
Since k is not necessarily of the form 2^n - 1, we need a strategy to find the best combination (if you don't want to use brute force on the k possible values).
Consider then the binary representation of k and loop over its bits starting from the MSB, initializing two variables: CAV (current added value) = 0, BAV (best added value) = 0.
If the current bit is 0, move on to the next bit.
If the current bit is 1:
a) calculate the AV sum of all useful bits with lower index plus the CAV; if the result is greater than the BAV, replace BAV
b) if the current bit is not useful, quit the loop
c) add the current bit's added value to CAV
When the loop is over, if CAV is greater than BAV, replace BAV
EDIT: A sample implementation (in Java, sorry :) )
import java.util.Scanner;

public class XorSum {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int n = sc.nextInt();
        int k = sc.nextInt();
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = sc.nextInt();
        }
        //Determine the number of bits to represent k (position of most significant 1 + 1)
        int msb = 0;
        for (int kcopy = k; kcopy != 0; kcopy = kcopy >>> 1) {
            msb++;
        }
        //Compute the added value of each possible bit in x
        //(long is used because n * 2^bit can overflow an int)
        long[] av = new long[msb];
        for (int bit = 0; bit < msb; bit++) {
            long count0 = 0;
            for (int i = 0; i < n; i++) {
                if ((a[i] & (1 << bit)) == 0) {
                    count0++;
                }
            }
            av[bit] = (count0 * 2 - n) * (1L << bit);
        }
        //Accumulated added value, the value of all positive av bits up to the index
        long[] aav = new long[msb];
        for (int bit = 0; bit < msb; bit++) {
            if (av[bit] > 0) {
                aav[bit] = av[bit];
            }
            if (bit > 0) {
                aav[bit] += aav[bit - 1];
            }
        }
        //Explore the space of possible combinations moving on the k boundary
        long cval = 0;
        long bval = 0;
        //Start from the msb
        for (int bit = msb - 1; bit >= 0; bit--) {
            //Exploring the space of bit combination we have 3 possible cases:
            //bit of k is 0, then we must choose 0 as well, setting it to 1 will get x to be greater than k, so in this case just loop over
            if ((k & (1 << bit)) == 0) {
                continue;
            }
            //bit of k is 1, we can choose between 0 and 1:
            //- choosing 0, we can immediately explore the complete branch considering that all following (lower) bits can be set to 1, so just set to 1 all lower bits with positive av
            //  and get the maximum possible value for this branch
            long val = cval + (bit > 0 ? aav[bit - 1] : 0);
            if (val > bval) {
                bval = val;
            }
            //- choosing 1, if the bit has no positive av, then it's forced to 0 and the solution is found on the other branch, so we can stop here
            if (av[bit] <= 0) break;
            //- choosing 1, with a positive av, then store the value and go on with this branch
            cval += av[bit];
        }
        if (cval > bval) {
            bval = cval;
        }
        //Final sum
        for (int i = 0; i < n; i++) {
            bval += a[i];
        }
        System.out.println(bval);
    }
}
I think you can consider solving for each bit. The number X should be the one that turns on as many high-order bits in the array as possible. So you can count the number of 1 bits at positions 2^0, 2^1, ... and, for each of the 32 positions, consider turning the bit on in X when most of the numbers have a 0 at that position.
Combining this with the limit K should give you an answer that needs only O(log K) work once the bits are counted (counting them takes O(n log K)).
Assuming k is unbounded, this problem is trivial.
For each bit (assuming 64-bit words there would be 64 for example) accumulate the total count of 1's and 0's in all values in the array (for that bit), with c1_i and c0_i representing the former and latter respectively for bit i.
Then define each bit x_i of x as
x_i = 1 if c0_i > c1_i else 0
Constructing x as described above is guaranteed to give you the value of x that maximizes the sum of interest.
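As a rough sketch of the unbounded case: the function name `bestUnboundedX` and the `bits` parameter (how many low bits of x are considered, assumed to be at most 32) are hypothetical choices for illustration, not part of the answer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: picks each bit of x independently, as described above.
// Assumes bits <= 32.
std::uint32_t bestUnboundedX(const std::vector<std::uint32_t>& a, unsigned bits) {
    std::uint32_t x = 0;
    for (unsigned i = 0; i < bits; ++i) {
        std::size_t ones = 0;                        // c1_i: values with bit i set
        for (std::uint32_t v : a) ones += (v >> i) & 1u;
        const std::size_t zeros = a.size() - ones;   // c0_i: values with bit i clear
        if (zeros > ones) x |= std::uint32_t{1} << i;
    }
    return x;
}
```

For the sample array {1, 6, 3} and three bits, only bit 2 has more zeros than ones, so the sketch returns 4.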
When k is a specific number, this can be solved with dynamic programming. To understand how, first derive a recurrence.
Let z_0,z_1,...,z_n be the positions of the ones occurring in k's binary representation, with z_0 being the most significant position.
Let M[t] represent the maximum sum possible given the problem's array over any x such that x < t.
Important note: the optimal value of M[t] for t a power of 2 is obtained by following the procedure described above for an unbounded k, but limiting the largest bit used.
To solve this problem, we want to find
M[k] = max(M[2^z_0],M[k - 2^z_0] + C_0)
where C_i is defined to be the contribution to the final sum by setting the position z_i to one.
This of course continues as a recursion, with the next step being:
M[k - 2^z_0] = max(M[2^z_1],M[k - 2^z_0 - 2^z_1] + C_1)
and so on and so forth. The dynamic programming solution arises by converting this recursion to the appropriate DP algorithm.
Note, that due to the definition of M[k], it is still necessary to check if the sum of x=k is greater than M[k], as it may still be so, but this requires one pass.
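One way the recurrence could be turned into an iterative scan is sketched below. It is an assumption-laden illustration rather than the answer's own code: `maxXorSum`, `gain`, `freeBelow` and `tight` are names I made up, and the scan walks the bits of k from the most significant one, which mirrors the recursion on z_0, z_1, ...

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: maximum of sum(x XOR a[i]) over all x in [0, k].
std::int64_t maxXorSum(const std::vector<std::uint32_t>& a, std::uint32_t k) {
    const int BITS = 32;
    const std::int64_t n = static_cast<std::int64_t>(a.size());
    std::int64_t base = 0;                         // the sum for x = 0
    for (std::uint32_t v : a) base += v;

    // gain[i]: change of the sum when bit i of x is set, (zeros - ones) * 2^i
    std::vector<std::int64_t> gain(BITS, 0);
    for (int i = 0; i < BITS; ++i) {
        std::int64_t ones = 0;
        for (std::uint32_t v : a) ones += (v >> i) & 1u;
        gain[i] = (n - 2 * ones) * (std::int64_t{1} << i);
    }
    // freeBelow[i]: best total gain from bits below i when they are unconstrained
    std::vector<std::int64_t> freeBelow(BITS + 1, 0);
    for (int i = 0; i < BITS; ++i)
        freeBelow[i + 1] = freeBelow[i] + std::max<std::int64_t>(gain[i], 0);

    std::int64_t best = 0;    // x = 0 is always allowed
    std::int64_t tight = 0;   // gain of the prefix of x that equals the prefix of k
    for (int i = BITS - 1; i >= 0; --i) {
        if (((k >> i) & 1u) == 0) continue;        // x must also have 0 here
        best = std::max(best, tight + freeBelow[i]); // put 0 here, lower bits are free
        tight += gain[i];                            // put 1 here, stay equal to k
    }
    best = std::max(best, tight);                  // the case x == k
    return base + best;
}
```

The final `std::max` plays the role of the extra check for x = k mentioned above.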
At bit level it is simple: 0 XOR 0 = 0, 1 XOR 1 = 0 and 0 XOR 1 = 1. But when these bits belong to a number, the XOR operation has an addition or subtraction effect. For example, if the third bit of a number is set and the number is XORed with 4 (0100), which also has its third bit set, then 2^(3-1) is subtracted from the number: for num = 5, 0101 XOR 0100 = 0001, so 4 is subtracted from 5. Similarly, if the third bit of a number is not set and the number is XORed with 4, the result is an addition: for num = 2, 0010 XOR 0100 = 0110, so 4 is added to 2. Now let’s see this problem,
This problem can’t be solved by applying XOR on each number individually, rather the approach to solve this problem is Perform XOR on particular bit of all numbers, in one go! . Let’s see how it can be done?
Fact 1: Let’s consider we have X and we want to perform XOR on all numbers with X and if we know second bit of X is set, now suppose somehow we also know that how many numbers in all numbers have second bit set then we know answer 1 XOR 1 = 0 and we don’t have to perform XOR on each number individually.
Fact 2: From fact 1, we know how many numbers have a particular bit set, let’s call it M and if X also have that particular bit set then M * 2^(pos -1) will be subtracted from sum of all numbers. If N is total element in array than N - M numbers don’t have that particular bit set and due to it (N – M) * 2^(pos-1) will be added in sum of all numbers.
From Fact 1 and Fact 2 we can calculate overall XOR effect on a particular bit on all Numbers by effect = (N – M)* 2^(pos -1) – (M * 2^(pos -1)) and can perform the same for all bits.
Now it’s time to see above theory in action, if we have array = {1, 6, 3}, k = 7 then,
1 = 0001 (There are total 32 bits but I am showing only relevant bits other bits are zero)
6 = 0110
3 = 0011
So our bit count list = [0, 1, 2, 2] as you can see 1 and 3 have first bit set, 6 and 3 have second bit set and only 6 have third bit set.
X = 0, …, 7 but X = 0 have effect = 0 on sum because if bit is not set then it doesn’t not affect other bit in XOR operation, so let’s star from X = 1 which is 0001,
[0, 1, 2, 2] = count list,
[0, 0, 0, 1] = X
As is visible in the count list, two numbers have the first bit set and X also has the first bit set, so 2 * 2^(1 – 1) will be subtracted from the sum; the array holds three numbers in total, so (3 – 2) * 2^(1-1) will be added to the sum. The conclusion is that the XOR effect of the first bit is effect = (3 – 2) * 2^(1-1) - 2 * 2^(1 – 1) = 1 – 2 = -1. This is also the overall effect of X = 1, because it only has the first bit set and the rest of its bits are zero. At this point we compare the effect produced by X = 1 with that of X = 0: -1 < 0, which means X = 1 will reduce the sum of all numbers by 1 while X = 0 will not reduce it. So until now X = 0 produces the max sum.
The way XOR is performed for X = 1 can be performed for all other values and I would like to jump directly to X = 4 which is 0100
[0, 1, 2, 2] = count list,
[0, 1, 0, 0] = X
As is visible, X has only the third bit set and only one number in the array has the third bit set, so 1 * 2^(3 – 1) will be subtracted and (3 – 1) * 2^(3-1) will be added, and the overall effect = (3 – 1) * 2^(3-1) - 1 * 2^(3 – 1) = 8 – 4 = 4. At this point we compare the effect of X = 4 with the known max effect, which is 0: 4 > 0, so X = 4 now produces the max sum and we keep it. When you perform this for all X = 0,…,7, you will find that X = 4 produces the max effect on the sum, so the answer is X = 4.
So
(x XOR arr[0]) + (x XOR arr[1]) + … + (x XOR arr[n-1]) = effect + sum(arr[0] + arr[1] + … + arr[n-1])
Complexity is,
O(32 n) to find, for all 32 bits, how many numbers have a particular bit set, plus,
O(32 k) to find the effect of all X in [0, k],
Complexity = O(32 n) + O(32 k) = O(c n) + O(c k), here c is constant,
finally
Complexity = O(n + k)
#include <cstdint>
#include <utility>
#include <iostream>
#include <cmath>
#include <bitset>
#include <vector>
#include <numeric>
std::vector<std::uint32_t> bitCount(const std::vector<std::uint32_t>& numList){
std::vector<std::uint32_t> countList(32, 0);
for(std::uint32_t num : numList){
std::bitset<32> bitList(num);
for(unsigned i = 0; i< 32; ++i){
if(bitList[i]){
countList[i] += 1;
}
}
}
return countList;
}
std::pair<std::uint32_t, std::int64_t> prefXAndMaxEffect(std::uint32_t n, std::uint32_t k,
const std::vector<std::uint32_t>& bitCountList){
std::uint32_t prefX = 0;
std::int64_t xorMaxEffect = 0;
std::vector<std::int64_t> xorBitEffect(32, 0);
for(std::uint32_t x = 1; x<=k; ++x){
std::bitset<32> xBitList(x);
std::int64_t xorEffect = 0;
for(unsigned i = 0; i< 32; ++i){
if(xBitList[i]){
if(0 != xorBitEffect[i]){
xorEffect += xorBitEffect[i];
}
else{
std::int64_t num = std::exp2(i);
xorBitEffect[i] = (n - bitCountList[i])* num - (bitCountList[i] * num);
xorEffect += xorBitEffect[i];
}
}
}
if(xorEffect > xorMaxEffect){
prefX = x;
xorMaxEffect = xorEffect;
}
}
return {prefX, xorMaxEffect};
}
int main(int , char *[]){
std::uint32_t k = 7;
std::vector<std::uint32_t> numList{1, 6, 3};
std::pair<std::uint32_t, std::int64_t> xAndEffect = prefXAndMaxEffect(numList.size(), k, bitCount(numList));
std::int64_t sum = 0;
sum = std::accumulate(numList.cbegin(), numList.cend(), sum) + xAndEffect.second;
std::cout<< sum<< '\n';
}
Output :
14

Number of pairs with constant difference and bitwise AND zero

How to find the number of pairs whose difference is a given constant and their bitwise AND is zero? Basically, all (x,y) such that
x-y = k; where k is a given constant and
x&y = 0;
An interesting problem.
Let k_(n-1)...k_1 k_0 be the binary representation of k.
Let l be the smallest index i such that k_i=1
We can remark that a potential pair of solutions x and y must have all their bits i, i<l at zero.
Otherwise the only way to have a difference x-y with its ith bit unset would be to have x_i=y_i=1 and x&y will not have its ith bit unset.
Now we arrive at the first bit at one at index l.
The situation is more complex, as we have several ways to have this bit set in the result of x-y.
For that we must consider the set of bits l..m such that k_i=k_(i+1)=k_(i+2)=...=1 ∀l≤i<m and k_m=0
For instance, if l=0 and m=1, the two LSB of k are 01 and we can get this result by computing either 01-00 (1-0) or 10-01 (2-1). In either case, the result is correct (1) and the bits of x and y are opposite and give a zero when anded.
When the sequence is composed of several ones, the replacement must be done from the LSB for every pair of consecutive ones.
Here is an example. To simplify, we assume that the sequence starts at bit 0, but the generalization is immediate :
k=0111
Trivial solution x=k=0111 y=0=0000
Rewrite 1 at LSB as 2-1: add 1 to x and 1 to y
x=0111+0001=1000=8 y=0000+0001=0001
Rewrite bit at 1 at index 1 (2^1) as 4-2: add 2 to x and add 2 to y
x=0111+0010=1001 y=0000+0010=0010
Rewrite bit at 1 at index 2 (2^2) as 4=8-4: add 4 to x and add 4 to y
x=0111+0100=1011 y=0000+0100=0100
So, for a sequence of ones followed by a zero :
Compute the trivial solution where x=<sequence> and y=0
for every one in the sequence
let i be the position of this one
generate a new solution by adding 2^i to x and y of the trivial solution
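The pseudocode above can be sketched for the simplest case, a single run of `m` ones starting at bit 0 (so k = 2^m - 1). The function name `pairsForRunOfOnes` and the restriction to one run, with m assumed to be between 1 and 30, are assumptions made only for this illustration.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: (x, y) pairs produced by the construction above
// for k = 2^m - 1, assuming 1 <= m <= 30.
std::vector<std::pair<std::uint32_t, std::uint32_t>> pairsForRunOfOnes(unsigned m) {
    const std::uint32_t k = (std::uint32_t{1} << m) - 1;
    std::vector<std::pair<std::uint32_t, std::uint32_t>> result;
    result.emplace_back(k, 0u);                    // trivial solution: x = k, y = 0
    for (unsigned i = 0; i < m; ++i) {
        const std::uint32_t bit = std::uint32_t{1} << i;
        result.emplace_back(k + bit, bit);         // add 2^i to both x and y
    }
    return result;
}
```

Adding 2^i to the all-ones pattern clears bits i to m-1 of x and carries into bit m, so bit i of x is 0 while it is the only bit set in y.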
To summarize, one must decompose the number into two kinds of sequences, starting at the LSB
* zeroes is a sequence of consecutive zeroes
* ones is a sequence of ones followed by a zero
The results are obtained by replacing
* zeroes by a set of zeroes
* ones by adding 0, 1, 2, 4, ..., 2^i to the trivial solution 01111..11/000...000
Example :
k = 22 = 16+4+2 = 0 0 0 1 0 1 1 0
Rewrite first sequence
011 -> 011/000 (trivial solution)
100/001 (trivial solution+1)
101/010 (trivial solution+2)
Rewrite second sequence
01 -> 01/00 (trivial solution)
10/01 (trivial solution + 1)
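And so this construction gives 3*2=6 solutions (other pairs, such as 32/10, also satisfy both conditions, so the list is not exhaustive)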
010110/000000 22/0
011000/000010 24/2
011010/000100 26/4
100110/010000 38/16
101000/010010 40/18
101010/010100 42/20
Java implementation would be like this ...
import java.util.ArrayList;
public class FindPairs {
public static void main(String args[]) {
int arr[] = {1,3,4,5,6,9};
int k = 3;
ArrayList<String> out = new ArrayList<String>();
for(int i=0; i<arr.length; i++) {
for(int j=i+1; j<arr.length; j++) {
if((Math.abs(arr[i]-arr[j]) == k) && ((arr[i]&arr[j]) == 0)) {
out.add("("+arr[i]+","+arr[j]+")");
}
}
}
if(out.size()>0) {
for(String pair:out) {
System.out.println(pair);
}
}else {
System.out.println("No such pair !");
}
}
}

Is there any number repeated in the array?

There's array of size n. The values can be between 0 and (n-1) as the indices.
For example: array[4] = {0, 2, 1, 3}
I should say if there's any number that is repeated more than 1 time.
For example: array[5] = {3,4,1,2,4} -> return true because 4 is repeated.
This question has so many different solutions and I would like to know if this specific solution is alright (if yes, please prove, else refute).
My solution (let's look at the next example):
array: indices 0 1 2 3 4
values 3 4 1 2 0
So I suggest:
count the sum of the indices (4x5 / 2 = 10) and check that the values' sum (3+4+1+2+0) is equal to this sum. if not, there's repeated number.
in addition to the first condition, get the multiplication of the indices(except 0. so: 1x2x3x4) and check if it's equal to the values' multiplication (except 0, so: 3x4x1x2x0).
=> if in each condition, it's equal then I say that there is NO repeated number. otherwise, there IS a repeated number.
Is it correct? if yes, please prove it or show me a link. else, please refute it.
Why your algorithm is wrong?
Your solution is wrong, here is a counter example (there may be simpler ones, but I found this one quite quickly):
int arr[13] = {1, 1, 2, 3, 4, 10, 6, 7, 8, 9, 10, 11, 6};
The sum is 78, and the product is 479001600, if you take the normal array of size 13:
int arr[13] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
It also has a sum of 78 and a product of 479001600 so your algorithm does not work.
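The counter example can be checked mechanically. The sketch below implements the sum-and-product test from the question under a hypothetical name, `sumProductCheckPasses`; it assumes arrays small enough that the product fits in a `long long` (up to about 20 elements).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true when the question's test reports "no repetition",
// i.e. the sum and the zero-skipping product both match those of 0..n-1.
bool sumProductCheckPasses(const std::vector<int>& values) {
    long long sum = 0, product = 1, expectedSum = 0, expectedProduct = 1;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sum += values[i];
        if (values[i] != 0) product *= values[i];          // zeros are skipped
        expectedSum += static_cast<long long>(i);
        if (i != 0) expectedProduct *= static_cast<long long>(i);
    }
    return sum == expectedSum && product == expectedProduct;
}
```

Calling it with {1, 1, 2, 3, 4, 10, 6, 7, 8, 9, 10, 11, 6} returns true even though 1, 6 and 10 are repeated, which is exactly the failure described above.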
How to find counter examples? [1]
To find a counter example [2][3]:
Take an array from 0 to N - 1;
Pick two even numbers [3] M1 > 2 and M2 > 2 between 0 and N - 1 and halve them;
Replace P1 = M1/2 - 1 by 2 * P1 and P2 = M2/2 + 1 by 2 * P2.
In the original array you have:
Product = M1 * P1 * M2 * P2
Sum = 0 + M1 + P1 + M2 + P2
= M1 + M1/2 - 1 + M2 + M2/2 + 1
= 3/2 * (M1 + M2)
In the new array you have:
Product = M1/2 * 2 * P1 * M2/2 * 2 * P2
= M1 * P1 * M2 * P2
Sum = M1/2 + 2P1 + M2/2 + 2P2
= M1/2 + 2(M1/2 - 1) + M2/2 + 2(M2/2 + 1)
= 3/2 * M1 - 2 + 3/2 * M2 + 2
= 3/2 * (M1 + M2)
So both array have the same sum and product, but one has repeated values, so your algorithm does not work.
[1] This is one method of finding counter examples, there may be others (there are probably others).
[2] This is not exactly the same method I used to find the first counter example. In the original method, I used only one number M and was using the fact that you can replace 0 by 1 without changing the product, but I propose a more general method here in order to avoid arguments such as "But I can add a check for 0 in my algorithm.".
[3] That method does not work with small arrays because you need to find 2 even numbers M1 > 2 and M2 > 2 such that M1/2 != M2 (and reciprocally) and M1/2 - 1 != M2/2 + 1, which (I think) is not possible for any array with a size lower than 14.
What algorithms do work? [4]
Algorithm 1: O(n) time and space complexity.
If you can allocate a new array of size N, then:
template <std::size_t N>
bool has_repetition (std::array<int, N> const& array) {
std::array<bool, N> rep = {0};
for (auto v: array) {
if (rep[v]) {
return true;
}
rep[v] = true;
}
return false;
}
Algorithm 2: O(nlog(n)) time complexity and O(1) space complexity, with a mutable array.
You can simply sort the array:
template <std::size_t N>
bool has_repetition (std::array<int, N> &array) {
std::sort(std::begin(array), std::end(array));
auto it = std::begin(array);
auto ne = std::next(it);
while (ne != std::end(array)) {
if (*ne == *it) {
return true;
}
++it; ++ne;
}
return false;
}
Algorithm 3: O(n^2) time complexity and O(1) space complexity, with non mutable array.
template <std::size_t N>
bool has_repetition (std::array<int, N> const& array) {
for (auto it = std::begin(array); it != std::end(array); ++it) {
for (auto jt = std::next(it); jt != std::end(array); ++jt) {
if (*it == *jt) {
return true;
}
}
}
return false;
}
[4] These algorithms do work, but there may exist other ones that perform better. These are only the simplest ones I could think of given some "restrictions".
What's wrong with your method?
Your method computes some statistics of the data and compares them with those expected for a permutation (= correct answers). While a violation of any of these comparisons is conclusive (the data cannot satisfy the constraint), the inverse is not necessarily the case. You only look at two statistics, and these are too few for sufficiently large data sets. Owing to the fact that the data are integer, the smallest number of data for which your method may fail is larger than 3.
If you are searching for duplicates in your array there is a simple way:
bool hasDuplicate() {
    const int N = 5;
    int array[N] = {1,2,3,4,4};
    for (int i = 0; i < N; i++) {
        for (int j = i+1; j < N; j++) {
            if (array[j] == array[i]) {
                std::cout << "DUPLICATE FOUND\n";
                return true;
            }
        }
    }
    return false;
}
Another simple way to find duplicates is to use the std::set container, for example:
std::set<int> set_int;
set_int.insert(5);
set_int.insert(5);
set_int.insert(4);
set_int.insert(4);
set_int.insert(5);
std::cout<<"\nsize "<<set_int.size();
The output will be 2, because there are 2 distinct values; a set that ends up smaller than the array therefore means the array contains duplicates.
A more in depth explanation why your algorithm is wrong:
count the sum of the indices (4x5 / 2 = 10) and check that the values' sum (3+4+1+2+0) is equal to this sum. if not, there's repeated number.
Given any array A which has no duplicates, it is easy to create an array that meets your first requirement but now contains duplicates. Just take two values, subtract some value v from one of them and add v to the other. Or take multiple values and make sure their sum stays the same. (As long as the new values are still within the 0 .. N-1 range.) For N = 3 it is already possible to change {0,1,2} to {1,1,1}. For an array of size 3, there are 7 compositions that have the correct sum, but 1 is a false positive. For an array of size 4, 20 out of 44 have duplicates, for an array of size 5 that's 261 out of 381, for an array of size 6 that's 3612 out of 4332, and so on. It is safe to say that the number of false positives grows much faster than the number of real positives.
in addition to the first condition, get the multiplication of the indices(except 0. so: 1x2x3x4) and check if it's equal to the values' multiplication (except 0, so: 3x4x1x2x0).
The second requirement involves the multiplication of all indices above 0. It is easy to see that this could never be a very strong restriction either. As soon as one of the indices is not prime, the product of all indices is no longer uniquely tied to the multiplicands, and a list of different values with the same product can be constructed. E.g. a pair of 2 and 6 can be replaced with 3 and 4, 2 and 9 can be replaced with 6 and 3, and so on. Obviously the number of false positives increases as the array size gets larger and more non-prime values are used as multiplicands.
None of these requirements is really strong and they cannot compensate for each other. Since 0 is not even considered for the second restriction, a false positive can be created fairly easily for arrays starting at size 5: any pair of 0 and 4 can simply be replaced with two 2's in any unique array, for example {2, 1, 2, 3, 2}
What you would need is a result that is uniquely tied to the occurring values. You could tweak your second requirement into a more complex approach that skips over the non-prime values and takes 0 into account. For example, you could use the first prime (2) as the multiplicand for 0, 3 as the multiplicand for 1, 5 as the multiplicand for 2, and so on. That would work (you would not need the first requirement), but this approach would be overly complex. A simpler way to get a unique result would be to OR the i-th bit for each value (0 => 1 << 0, 1 => 1 << 1, 2 => 1 << 2, and so on). (Obviously it is faster to check whether a bit was already set by a reoccurring value, rather than wait for the final result. And this is conceptually the same as using a bool array/vector from the other examples!)
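A minimal sketch of the bit-per-value idea, assuming every value lies between 0 and 63 so that a single 64-bit word is enough. The name `hasRepetitionBitmask` and the exception thrown for out-of-range input are my own choices, not part of the answer.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: detects a repeated value with one bit per possible value.
// Assumes every value lies in [0, 63]; larger ranges would need a bool array/vector.
bool hasRepetitionBitmask(const std::vector<int>& values) {
    std::uint64_t seen = 0;
    for (int v : values) {
        if (v < 0 || v > 63) throw std::invalid_argument("value outside 0..63");
        const std::uint64_t bit = std::uint64_t{1} << v;
        if (seen & bit) return true;   // bit already set: v occurred before
        seen |= bit;
    }
    return false;
}
```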

Generating bit combination without repetitions (not permutation)

Here is my previous question about finding next bit permutation. It occurs to me that I have to modify my code to achieve something similar to next bit permutation, but quite different.
I am coding information about neighbors of vertex in graph in bit representation of int. For example if n = 4 (n - graph vertices) and graph is full, my array of vertices looks like:
vertices[0]=14 // 1110 - it means vertex no. 1 is connected with vertices no. 2, 3, and 4
vertices[1]=13 // 1101 - it means vertex no. 2 is connected with vertices no. 1, 3, and 4
vertices[2]=11 // 1011 - it means vertex no. 3 is connected with vertices no. 1, 2, and 4
vertices[3]=7 // 0111 - it means vertex no. 4 is connected with vertices no. 1, 2, and 3
First (main) for loop is from 0 to 2^n (cause 2^n is number of subsets of a set).
So if n = 4, then there are 16 subsets:
{empty}, {1}, ..., {4}, {0,1}, {0,2}, ..., {3,4}, {0,1,2}, ..., {1,2,3}, {1,2,3,4}
These subsets are represented by index value in for loop
for(int i=0; i < (1 << n); ++i) // i - represents value of subset
Let's say n = 4, and actually i = 5 //0101. I'd like to check subsets of this subset, so I would like to check:
0000
0001
0100
0101
Now I'm generating all bit permutation of 1 bit set, then permutation of 2 bits set ... and so on (until I reach BitCount(5) = 2) and I only take permutation I want (by if statement). It's too many unneeded computations.
So my question is, how to generate all possible COMBINATIONS WITHOUT REPETITIONS (n,k) where n - graph vertices and k - number of bits in i (stated above)
My actual code (which generates all bit permutations and filters out the wrong ones):
for (int i = 0; i < PowerNumber; i++)
{
int independentSetsSum = 0;
int bc = BitCount(i);
if(bc == 1) independentSetsSum = 1;
else if (bc > 1)
{
for(int j = 1; j <= bc; ++j)
{
unsigned int v = (1 << j) - 1; // current permutation of bits
int bc2 = BitCount(j);
while(v <= i)
{
if((i & v) == v)
for(int neigh = 1; neigh <= bc2; neigh++)
if((v & vertices[GetBitPositionByNr(v, neigh) - 1]) == 0)
independentSetsSum ++;
unsigned int t = (v | (v - 1)) + 1;
v = t | ((((t & -t) / (v & -v)) >> 1) - 1);
}
}
}
}
All of this is because I have to count independent set number of every subset of n.
EDIT
I'd like to do it without creating any arrays or generally I'd like to avoid allocating any memory (neither vectors).
A little bit of an explanation:
n=5 //00101 - it is bit count of a number i - stated above, k=3, numbers in set (number represents bit position set to 1)
{
1, // 0000001
2, // 0000010
4, // 0001000
6, // 0100000
7 // 1000000
}
So correct combination is {1,2,6} // 0100011, but {1,3,6} // 0100101 is a wrong combination. In my code there are plenty of wrong combinations which I have to filter.
I'm not sure I correctly understand exactly what you want, but based on your example (where i==5) you want all the subsets of a given subset.
If that's the case, you can generate all these subsets directly.
int subset = 5;
int x = subset;
while(x) {
//at this point x is a valid subset
doStuff(x);
x = (x-1)&subset;
}
doStuff(0); //0 is always valid
Hope this helps.
My first guess to generate all the possible combinations would be the following rules (sorry if it's a bit hard to read)
start from the combination where all the 1s are on the left, all the 0s are on the right
move the leftmost 1 with a 0 on its immediate right to the right
if that bit had a 1 on its immediate left then
move all the 1s on its left all the way to the left
you're finished when you reach the combination with all the 1s on the right, and all the 0s on the left
Applying these rules for n=5 and k=3 would give this:
11100
11010
10110
01110
11001
10101
01101
10011
01011
00111
But that doesn't strike me as really efficient (and/or elegant).
A better way would be to find a way to iterate through these numbers by flipping only a constant number of bits (I mean, you'd always need to flip O(1) bits to reach the next combination, rather than O(n)); that may allow a more efficient iteration (a bit like the https://en.wikipedia.org/wiki/Gray_code ).
I'll edit or post another answer if I find better.
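For comparison, the same ten combinations can be produced without any array of positions by the well-known "next bit permutation" trick (often called Gosper's hack), which is a different technique from the rule list above. The function name `combinations` and the assumption 0 < k <= n < 32 are mine.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: all n-bit values with exactly k bits set, in ascending order.
// Assumes 0 < k <= n < 32.
std::vector<std::uint32_t> combinations(unsigned n, unsigned k) {
    std::vector<std::uint32_t> out;
    const std::uint32_t limit = std::uint32_t{1} << n;
    std::uint32_t v = (std::uint32_t{1} << k) - 1;   // lowest k bits set
    while (v < limit) {
        out.push_back(v);
        const std::uint32_t c = v & (~v + 1);        // lowest set bit of v
        const std::uint32_t r = v + c;               // carry through the lowest run of ones
        v = (((r ^ v) >> 2) / c) | r;                // move the rest of the run to the bottom
    }
    return out;
}
```

For n=5 and k=3 this yields 7, 11, 13, 14, 19, 21, 22, 25, 26, 28, which is the list above when each line is read with its leftmost digit as the least significant bit. The vector is only there to make the result easy to inspect; the loop body can call any per-combination code directly.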

Finding missing number using binary search

I am reading the book Programming Pearls.
Question: Given a sequential file that contains at most four billion
32 bit integers in random order, find a 32-bit integer that isn't in
the file (and there must be at least one missing). This problem has to
be solved if we have a few hundred bytes of main memory and several
sequential files.
Solution: To set this up as a binary search we have to define a range,
a representation for the elements within the range, and a probing
method to determine which half of a range holds the missing integer.
How do we do this?
We'll use as the range a sequence of integers known to contain at least
one missing element, and we'll represent the range by a file
containing all the integers in it. The insight is that we can probe a
range by counting the elements above and below its midpoint: either
the upper or the lower range has at most half the elements in the total
range. Because the total range has a missing element, the smaller half
must also have a missing element. These are most ingredients of a
binary search algorithm for above problem.
The above text is copyright of Jon Bentley, from the book Programming Pearls.
Some info is provided at following link
"Programming Pearls" binary search help
How do we search in passes using binary search? I also could not follow the example given in the above link. Please help me understand the logic with just 5 integers rather than a million integers.
Why don't you re-read the answer in the post "Programming Pearls" binary search help. It explains the process on 5 integers as you ask.
The idea is that you parse each list and break it into 2 separate lists (this is where the "binary" part comes from) based on the value of the first bit.
For example, showing the binary representation of the actual numbers:
Original List "": 001, 010, 110, 000, 100, 011, 101 => (broken into)
(we remove the first bit and append it to the "name" of the new list)
To form each of the lists below, we take the values starting with 0 or 1, respectively, from the list above:
List "0": 01, 10, 00, 11 (formed from the subset 001, 010, 000, 011 of List "" by removing the first bit and appending it to the "name" of the new list)
List "1": 10, 00, 01 (formed from the subset 110, 100, 101 of List "" by removing the first bit and appending it to the "name" of the new list)
Now take each of the resulting lists in turn and repeat the process:
List "0" becomes your original list and you break it into
List "00" and
List "01" (the last digit of each name is again the first remaining bit of the numbers in the list being broken up)
Carry on until you end up with the empty list(s).
EDIT
Process step by step:
List "": 001, 010, 110, 000, 100, 011, 101 =>
List "0": 01, 10, 00, 11 (from subset 001, 010, 000, 011 of the List "") =>
List "00": 1, 0 (from subset 01, 00 of the List "0") =>
List "000": 0 [final result] (from subset 0 of the List "00")
List "001": 1 [final result] (from subset 1 of the List "00")
List "01": 0, 1 (from subset 10, 11 of the List "0") =>
List "010": 0 [final result] (from subset 0 of the List "01")
List "011": 1 [final result] (from subset 1 of the List "01")
List "1": 10, 00, 01 (from subset 110, 100, 101 of the List "") =>
List "10": 0, 1 (from subset 00, 01 of the List "1") =>
List "100": 0 [final result] (from subset 0 of the List "10")
List "101": 1 [final result] (from subset 1 of the List "10")
List "11": 0 (from subset 10 of the List "1") =>
List "110": 0 [final result] (from subset 0 of the List "11")
List "111": absent [final result] (from subset EMPTY of the List "11")
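The walk-through above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the function name findMissingByBits is hypothetical, the lists are held in memory as vectors instead of files, and instead of expanding every sub-list it follows only the smaller half at each step, which is enough to find one missing number.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns one value in [0, 2^bits) that is absent
// from `values`. Assumes values.size() < 2^bits and 1 <= bits <= 32.
std::uint32_t findMissingByBits(std::vector<std::uint32_t> values, int bits) {
    std::uint32_t name = 0;  // the "name" of the current list, built bit by bit
    for (int bit = bits - 1; bit >= 0; --bit) {
        std::vector<std::uint32_t> zeros, ones;
        for (std::uint32_t v : values) {
            if ((v >> bit) & 1u) {
                ones.push_back(v);
            } else {
                zeros.push_back(v);
            }
        }
        // The two halves together hold fewer values than there are possible
        // values with this name, so the smaller half must be missing one.
        if (zeros.size() <= ones.size()) {
            values = std::move(zeros);
        } else {
            name |= std::uint32_t{1} << bit;
            values = std::move(ones);
        }
    }
    return name;  // the list with this name ended up empty
}
```

Called with the seven numbers from the example (1, 2, 6, 0, 4, 3, 5) and bits = 3, this follows List "1", then List "11", then the empty List "111", and returns 7.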
The advantage of this method is that it allows you to find ANY number of missing numbers in the set, i.e. it still works if more than one is missing.
P.S. AFAIR, for a single missing number out of the complete range there is an even more elegant solution: XOR all the numbers together (the XOR of the numbers present, combined with the XOR of the complete range, is the missing number).
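A minimal sketch of the XOR idea, assuming the input holds every integer in [lo, hi] exactly once except for a single missing one. The function name findMissingByXor and the in-memory vector are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <vector>

// Assumes `values` contains each integer in [lo, hi] exactly once,
// except for one value that is missing.
std::uint32_t findMissingByXor(const std::vector<std::uint32_t>& values,
                               std::uint32_t lo, std::uint32_t hi) {
    std::uint32_t acc = 0;
    // XOR of the complete range; the loop exits on x == hi so that
    // hi == 0xFFFFFFFF does not wrap around.
    for (std::uint32_t x = lo;; ++x) {
        acc ^= x;
        if (x == hi) break;
    }
    // Every value that is present cancels itself out.
    for (std::uint32_t v : values) {
        acc ^= v;
    }
    return acc;  // only the missing value is left
}
```

For the values 3, 4, 1, 5 with lo = 1 and hi = 5, the result is 2. Unlike the list-splitting method, this breaks down if a value is duplicated or if more than one value is missing.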
The idea is to solve an easier problem:
Is the missing value in the range [minVal, X] or in (X, maxVal]?
If you know this, you can move X and check again.
For example, you have 3, 4, 1, 5 (2 is missing).
You know that minVal = 1, maxVal = 5.
Range = [1, 5], X = 3. There should be 3 integers in the range [1, 3] and 2 in the range [4, 5]. There are only 2 in the range [1, 3], so you continue in [1, 3].
Range = [1, 3], X = 2. There is only 1 value in the range [1, 2], so you continue in [1, 2].
Range = [1, 2], X = 1. The range [1, 1] holds its one value, so you move to [2, 2]; there are no values in [2, 2], so 2 is your answer.
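EDIT: Some pseudo-C++ code:
minVal = 1, maxVal = 5; // choose the correct values for your data
while (minVal < maxVal) {
    int X = (minVal + maxVal) / 2;
    int leftNumber = /* how many values fall in [minVal, X] */;
    int rightNumber = /* how many values fall in [X + 1, maxVal] */;
    if (leftNumber < (X - minVal + 1)) maxVal = X; // left half is missing a value
    else minVal = X + 1;                           // left half is full, look right
}
// minVal is now the missing value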
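The pseudo-code can be turned into something runnable for the small example. This is a sketch, not the file-based solution: it assumes the values sit in a std::vector<int>, and the names countInRange and findMissing are made up for illustration.

```cpp
#include <vector>

// Counts how many values in `data` lie in the closed range [lo, hi].
// In the file-based setting this would be one sequential pass.
int countInRange(const std::vector<int>& data, int lo, int hi) {
    int n = 0;
    for (int v : data) {
        if (lo <= v && v <= hi) ++n;
    }
    return n;
}

// Assumes `data` holds distinct values from [minVal, maxVal] with at
// least one value of that range missing.
int findMissing(const std::vector<int>& data, int minVal, int maxVal) {
    while (minVal < maxVal) {
        int X = minVal + (maxVal - minVal) / 2;
        int leftNumber = countInRange(data, minVal, X);
        if (leftNumber < X - minVal + 1) {
            maxVal = X;      // left half is missing a value
        } else {
            minVal = X + 1;  // left half is full, so look right
        }
    }
    return minVal;
}
```

With the values 3, 4, 1, 5 and the range [1, 5], the search narrows to [1, 3], then [1, 2], then [2, 2], and returns 2, matching the steps listed above.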
Here's a simple C solution which should illustrate the technique. To abstract away any tedious file I/O details, I'm assuming the existence of the following three functions:
unsigned long next_number (void) reads a number from the file and returns it. When called again, the next number in the file is returned, and so on. Behavior when the end of file is encountered is undefined.
int numbers_left (void) returns a true value if there are more numbers available to be read using next_number(), false if the end of the file has been reached.
void return_to_start (void) rewinds the reading position to the start of the file, so that the next call to next_number() returns the first number in the file.
I'm also assuming that unsigned long is at least 32 bits wide, as required for conforming ANSI C implementations; modern C programmers may prefer to use uint32_t from stdint.h instead.
Given these assumptions, here's the solution:
unsigned long count_numbers_in_range (unsigned long min, unsigned long max) {
    unsigned long count = 0;
    return_to_start();
    while ( numbers_left() ) {
        unsigned long num = next_number();
        if ( num >= min && num <= max ) {
            count++;
        }
    }
    return count;
}
unsigned long find_missing_number (void) {
    unsigned long min = 0, max = 0xFFFFFFFF;
    while ( min < max ) {
        unsigned long midpoint = min + (max - min) / 2;
        unsigned long count = count_numbers_in_range( min, midpoint );
        if ( count < midpoint - min + 1 ) {
            max = midpoint; // at least one missing number at or below midpoint
        } else {
            min = midpoint + 1; // no missing numbers at or below midpoint, must be above
        }
    }
    return min;
}
One detail to note is that min + (max - min) / 2 is the safe way to calculate the average of min and max; it won't produce bogus results due to overflowing intermediate values like the seemingly simpler (min + max) / 2 might.
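To see the difference concretely, here is a small sketch with two hypothetical helpers, naiveMidpoint and safeMidpoint, working on 32-bit unsigned values. Unsigned overflow wraps around in C and C++, so the naive version silently produces a midpoint outside the range.

```cpp
#include <cstdint>

// (lo + hi) / 2: the sum can wrap around modulo 2^32 before the division.
std::uint32_t naiveMidpoint(std::uint32_t lo, std::uint32_t hi) {
    return static_cast<std::uint32_t>(lo + hi) / 2;
}

// lo + (hi - lo) / 2: no intermediate value exceeds hi, assuming lo <= hi.
std::uint32_t safeMidpoint(std::uint32_t lo, std::uint32_t hi) {
    return lo + (hi - lo) / 2;
}
```

For lo = 0x80000000 and hi = 0xFFFFFFFF, naiveMidpoint gives 0x3FFFFFFF, which lies below lo, while safeMidpoint gives 0xBFFFFFFF.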
Also, even though it would be tempting to solve this problem using recursion, I chose an iterative solution instead for two reasons: first, because it (arguably) shows more clearly what's actually being done, and second, because the task was to minimize memory use, which presumably includes the stack too.
Finally, it would be easy to optimize this code, e.g. by returning as soon as count equals zero, by counting the numbers in both halves of the range in one pass and choosing the one with more missing numbers, or even by extending the binary search to n-ary search for some n > 2 to reduce the number of passes. However, to keep the example code as simple as possible, I've left such optimizations unmade. If you like, you may want to, say, try modifying the code so that it requires at most eight passes over the file instead of the current 32. (Hint: use a 16-element array.)
Actually, suppose we have a range of integers from a to b: [a..b].
And suppose our sequence holds b - a distinct integers from this range. Since the range contains b - a + 1 integers, exactly one is missing.
If only one is missing, we can calculate the result with a single loop.
First we calculate the sum of all integers in the range [a..b], which equals:
sum = (a + b) * (b - a + 1) / 2
Then we calculate the sum of all integers in our sequence:
long sum1 = 0;
for (int i = 0; i < b - a; i++)
    sum1 += arr[i];
Then we find the missing element as the difference of those two sums:
long result = sum - sum1;
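Put together as a self-contained function, the sum approach might look like the sketch below. The name findMissingBySum is an assumption, and 64-bit integers are used so that the sums do not overflow for moderately sized ranges; for the full 32-bit range from the question, the sum of all values would still need more careful handling.

```cpp
#include <cstdint>
#include <vector>

// Assumes `arr` holds every integer in [a, b] exactly once, except for
// one missing value, and that the sums fit in 64 bits.
std::int64_t findMissingBySum(const std::vector<std::int64_t>& arr,
                              std::int64_t a, std::int64_t b) {
    // Sum of the complete range [a..b].
    std::int64_t expected = (a + b) * (b - a + 1) / 2;
    // Sum of the values actually present.
    std::int64_t actual = 0;
    for (std::int64_t v : arr) {
        actual += v;
    }
    return expected - actual;
}
```

For the values 3, 4, 1, 5 in the range [1..5], the expected sum is 15, the actual sum is 13, and the result is 2.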
When you've seen 2^31 zeros or ones in the ith digit place, your answer has a one or a zero, respectively, in the ith place. (Ex: 2^31 ones in the 5th binary position means the answer has a zero in the 5th binary position.) This assumes the list holds every 32-bit value exactly once, except for the single missing one.
First draft of C code:
#include <stdint.h>

/* list4BILLION points to the numbers; how it gets there is not important to the question.
   count is how many numbers there are (2^32 - 1 when exactly one value is missing). */
uint32_t find_missing(const uint32_t *list4BILLION, uint64_t count)
{
    uint32_t binaryHistogram[32]; /* number of 1s seen so far in each bit position */
    uint32_t placesChecked[32];   /* nonzero once that bit of the answer is known */
    uint32_t answer = 0;
    const uint32_t halfLimit = 4294967296ULL / 2; /* 2^31 */
    uint64_t i;
    int j, done;

    /* Initialize both arrays to zero. */
    for (j = 0; j < 32; j++)
    {
        binaryHistogram[j] = 0;
        placesChecked[j] = 0;
    }
    /* No check can succeed before 2^31 numbers have been read, but testing
       on every element keeps the loop simple. */
    for (i = 0; i < count; i++)
    {
        done = 1; /* stays 1 only if every bit of the answer is already known */
        for (j = 0; j < 32; j++)
        {
            if (placesChecked[j] == 0)
            {
                binaryHistogram[j] += (list4BILLION[i] >> j) & 1u;
                if (binaryHistogram[j] == halfLimit)
                {
                    placesChecked[j] = 1; /* all 2^31 ones seen: answer bit j is 0 */
                }
                else if ((i + 1) - binaryHistogram[j] == halfLimit)
                {
                    answer |= (uint32_t)1 << j; /* all 2^31 zeros seen: answer bit j is 1 */
                    placesChecked[j] = 1;
                }
                else
                {
                    done = 0;
                }
            }
        }
        if (done)
        {
            break; /* no need to pass through the rest of the list */
        }
    }
    return answer;
}