Algorithm about number theory - c++

Given two positive integers a, b (1 <= a <= 30, 1 <= b <= 10000000), define two sets L and R (sets, so no repeated elements):
L = {x * y | 1 <= x <= a, 1 <= y <= b, x, y integers}
R = {x ^ y | 1 <= x <= a, 1 <= y <= b, x, y integers},
where ^ is the XOR operation.
For any two integers A∈L, B∈R, we pad B to n+1 decimal digits (n is the number of decimal digits of b) by prepending zeros, and then append B to the end of A to get a new integer AB.
Compute the sum of all generated integers AB (in case the sum overflows, just return "sum mod 1000000007", where mod is the modulo operation).
Note: your algorithm must run in no more than 3 seconds.
My algorithm is very simple: we can easily get the maximum number in set R, and the elements of R are 0, 1, 2, 3, ..., maxXor (the element max(a,b) may not be in R); I use a hash table to compute set L. But the algorithm takes 4 seconds when a = 30, b = 100000.
An example:
a = 2, b = 4, so
L = {1 * 1, 1 * 2, 1 * 3, 1 * 4, 2 * 1, 2 * 2, 2 * 3, 2 * 4} = {1, 2, 3, 4, 6, 8}
R = {1^1,1^2,1^3,1^4,2^1,2^2,2^3,2^4} = {0, 1, 2, 3, 5, 6}
All generated integers AB are:
{
100, 101, 102, 103, 105, 106,
200, 201, 202, 203, 205, 206,
300, 301, 302, 303, 305, 306,
400, 401, 402, 403, 405, 406,
600, 601, 602, 603, 605, 606,
800, 801, 802, 803, 805, 806
}
The sum of all AB is 14502

So the number AB can be written as 10^(n+1) A + B. Which means that summing over all A, B, the total is equal to
|R| 10^(n+1) Sum(A in L) + |L| Sum(B in R)
In your example,
|L| = 6
|R| = 6
Sum(A in L) = 24
Sum(B in R) = 17
n = 1 (b = 4 has one decimal digit, so 10^(n+1) = 100)
which when plugged into the above formula gives 14,502.
This reduces the runtime in the size of the sets from quadratic to linear, so you should see quite a huge improvement.
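As a sanity check of the formula above, here is a minimal Python sketch that builds L and R by brute force for small inputs and compares the pairwise concatenation against the closed form. The function names `brute_force` and `by_formula` are made up for this illustration, and the sets are still enumerated directly, so this only checks the identity and is not a fast solution for large b.

```python
def build_sets(a, b):
    """Enumerate L (distinct products) and R (distinct XORs) directly."""
    L = {x * y for x in range(1, a + 1) for y in range(1, b + 1)}
    R = {x ^ y for x in range(1, a + 1) for y in range(1, b + 1)}
    return L, R


def brute_force(a, b):
    """Concatenate every A in L with every zero-padded B in R and sum."""
    L, R = build_sets(a, b)
    width = len(str(b)) + 1  # n + 1 digits, n = number of digits of b
    return sum(int(f"{A}{B:0{width}d}") for A in L for B in R)


def by_formula(a, b):
    """|R| * 10^(n+1) * Sum(A in L) + |L| * Sum(B in R)."""
    L, R = build_sets(a, b)
    shift = 10 ** (len(str(b)) + 1)
    return len(R) * shift * sum(L) + len(L) * sum(R)


print(brute_force(2, 4))  # 14502
print(by_formula(2, 4))   # 14502
```

In a real submission the result would additionally be reduced mod 1000000007, as the problem statement asks.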
The next bits I haven't fleshed out fully because I don't have the time to, but they feel like they should work:
First, notice that Sum(A in L) would be trivial to calculate using
1 + 2 + .. + n = n(n+1)/2
if there wasn't the constraint that L doesn't contain repeats. You can get around this though by exploiting the fact that a is very small: iteratively calculate the sums 1, .., a using the triangular number formula and use that information to avoid counting a product more than once.
For Sum(B in R), notice that when you compare y and x^y, at most the first lg(a) bits have changed. So you can split a sum of x^ys into two sums: one which deals with the bits from lg(a)+1 upwards and which depends only on b, and a second, more complex sum which deals with the bits from lg(a) downwards and which depends on a and b.
Edit: The OP's asked me to expand on how to quickly compute Sum(A in L). There was a lot of stuff in this section in previous edits, but I've actually sat down and worked through it now rather than haphazardly batting it around in my head. It also turned out to be more complicated than I expected, so my apologies for not sitting down and working through it sooner #tenos.
So what we want to do is take the sum of all distinct products x*y such that 1 <= x <= a and 1 <= y <= b. Well, that turns out to be pretty hard so let's start with a simpler problem: given two integers x1, x2 with x1 < x2, how can we compute the sum of all distinct products x1*y or x2*y where 1 <= y <= b?
If we dropped the distinctness criterion, this'd be easy: it'd simply be
x1*Sum(b) + x2*Sum(b)
where Sum(j) denotes the sum of integers 1 through j inclusive, and can be calculated using Gauss's formula for the triangular numbers. So again we can reduce the problem into something simpler: how can we find the sum of all products that appear in both the left and right terms?
Well, two products are equal if
x1*y1 == x2*y2
This happens exactly when x1*y1 == x2*y2 == k*LCM(x1, x2), where LCM is the lowest common multiple and k is some integer.
The sum of this over all k such that 1 <= k*LCM(x1, x2) <= x1*b is
R(x1, x2) = LCM(x1, x2) * Sum(x1*b/LCM(x1, x2))
where R stands for "repeats". Which means that our sum of all distinct products x1*y or x2*y where 1 <= y <= b is
x1*Sum(b) + x2*Sum(b) - R(x1, x2)
Next, let's extend the definition of R to be defined on three variables x1 < x2 < x3 as
R(x1, x2, x3) = LCM(x1, x2, x3) * Sum(x1*b/LCM(x1, x2, x3))
and similarly for 4 variables, 5 variables, etc. Then the sum of distinct products for three x1 < x2 < x3 is
x1*Sum(b) + x2*Sum(b) + x3*Sum(b) - R(x1, x2) - R(x1, x3) - R(x2, x3) + R(x1, x2, x3)
by the inclusion-exclusion principle.
So, let's make use of this. Define
Sum for x = 1: 1*Sum(b)
Sum for x = 2: 2*Sum(b) - R(2, 1)
Sum for x = 3: 3*Sum(b) - R(3, 2) - R(3, 1) + R(3, 2, 1)
Etc. Then the sum of all these sums up to x = a is the sum of all distinct products.
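The inclusion-exclusion scheme above can be sketched in Python as follows. This is only an illustration under some assumptions: the names `tri`, `repeats` and `distinct_product_sum` are invented here, the division inside Sum(...) is taken to be integer (floor) division, and every subset of the smaller multipliers is enumerated, so the running time is exponential in a. It is meant for checking the idea on small inputs, not as the pruned solution. It needs Python 3.9+ for `math.lcm`.

```python
from itertools import combinations
from math import lcm


def tri(j):
    """Sum(j): the sum of the integers 1 through j."""
    return j * (j + 1) // 2


def repeats(xs, b):
    """R(x1, ..., xm): sum of the multiples of LCM(xs) that are <= min(xs) * b."""
    m = lcm(*xs)
    return m * tri(min(xs) * b // m)


def distinct_product_sum(a, b):
    """Sum of all distinct x*y with 1 <= x <= a and 1 <= y <= b."""
    total = 0
    for x in range(1, a + 1):
        # Products x*y that no smaller multiplier already produced:
        # inclusion-exclusion over all subsets of {1, ..., x-1}.
        for size in range(x):
            sign = -1 if size % 2 else 1
            for subset in combinations(range(1, x), size):
                total += sign * repeats(subset + (x,), b)
    return total


print(distinct_product_sum(2, 4))  # 24, matching Sum(A in L) in the example
```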
Edit: #tenos turned this into a useful solution. He noticed that since i*Sum(b) contains many repeats, we can replace it by i*sum(k...b), k = max(b/minPrimeFactor(i) + 1, i).
Further, when using the inclusion-exclusion principle, many unnecessary computations can be pruned. For instance, if R(1,2) = NULL, there is no need to compute R(1,2,3), R(1,2,4), etc. In fact, when b is very big, there are many R(i,..j) = NULL.

Related

How can I write this algorithm that returns the count between x and y in a list?

I am given this algorithmic problem, and need to find a way to return the count in a list S and another list L that is between some variable x and some variable y, inclusive, that runs in O(1) time:
I've issued a challenge against Jack. He will submit a list of his favorite years (from 0 to 2020). If Jack really likes a year,
he may list it multiple times. Since Jack comes up with this list on the fly, it is in no
particular order. Specifically, the list is not sorted, nor do years that appear in the list
multiple times appear next to each other in the list.
I will also submit such a list of years.
I then will ask Jack to pick a random year between 0 and 2020. Suppose Jack picks the year x.
At the same time, I will also then pick a random year between 0 and 2020. Suppose I
pick the year y. Without loss of generality, suppose that x ≤ y.
Once x and y are picked, Jack and I get a very short amount of time (perhaps 5
seconds) to decide if we want to re-do the process of selecting x and y.
If no one asks for a re-do, then we count the number of entries in Jack's list that are
between x and y inclusively and the number of entries in my list that are between x and
y inclusively.
More technically, here is the situation. You are given lists S and L of m and n integers,
respectively, in the range [0, k], representing the collections of years selected by Jack and
I. You may preprocess S and L in O(m+n+k) time. You must then give an algorithm
that runs in O(1) time – so that I can decide if I need to ask for a re-do – that solves the
following problem:
Input: Two integers, x as a member of [0,k] and y as a member of [0,k]
Output: the number of entries in S in the range [x, y], and the number of entries in L in [x, y].
For example, suppose S = {3, 1, 9, 2, 2, 3, 4}. Given x = 2 and y = 3, the returned count
would be 4.
I would prefer pseudocode; it helps me understand the problem a bit easier.
Implementing the approach of user3386109 taking care of edge case of x = 0.
user3386109 : Make a histogram, and then compute the accumulated sum for each entry in the histogram. Suppose S={3,1,9,2,2,3,4} and k is 9. The histogram is H={0,1,2,2,1,0,0,0,0,1}. After accumulating, H={0,1,3,5,6,6,6,6,6,7}. Given x=2 and y=3, the count is H[y] - H[x-1] = H[3] - H[1] = 5 - 1 = 4. Of course, x=0 is a corner case that has to be handled.
# INPUT
S = [3, 1, 9, 2, 2, 3, 4]
L = [2, 9, 4, 6, 8, 5, 3]
k = 9
x = 2
y = 3
# Histogram for S
S_hist = [0]*(k+1)
for element in S:
    S_hist[element] = S_hist[element] + 1
# Storing prefix sum in S_hist
total = S_hist[0]
for index in range(1,k+1):
    total = total + S_hist[index]
    S_hist[index] = total
# Similar approach for L
# Histogram for L
L_hist = [0] * (k+1)
for element in L:
    L_hist[element] = L_hist[element] + 1
# Storing prefix sum in L_hist
total = L_hist[0]
for index in range(1,k+1):
    total = total + L_hist[index]
    L_hist[index] = total
# Finding number of elements between x and y (inclusive) in S
print("number of elements between x and y (inclusive) in S:")
if(x == 0):
    print(S_hist[y])
else:
    print(S_hist[y] - S_hist[x-1])
# Finding number of elements between x and y (inclusive) in L
print("number of elements between x and y (inclusive) in L:")
if(x == 0):
    print(L_hist[y])
else:
    print(L_hist[y] - L_hist[x-1])
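The same idea can also be packaged as two small functions. The sketch below is one possible variant, not part of the original answer: `build_prefix` and `count_in_range` are names chosen for this example, and the prefix array is shifted by one slot so that the x = 0 corner case needs no separate branch.

```python
from itertools import accumulate


def build_prefix(values, k):
    """O(len(values) + k) preprocessing: prefix[v] = number of entries < v."""
    hist = [0] * (k + 1)
    for v in values:
        hist[v] += 1
    # The leading 0 shifts everything by one, so prefix[0] == 0.
    return [0] + list(accumulate(hist))


def count_in_range(prefix, x, y):
    """O(1) query: number of entries in [x, y], assuming 0 <= x <= y <= k."""
    return prefix[y + 1] - prefix[x]


S = [3, 1, 9, 2, 2, 3, 4]
L = [2, 9, 4, 6, 8, 5, 3]
S_prefix = build_prefix(S, 9)
L_prefix = build_prefix(L, 9)
print(count_in_range(S_prefix, 2, 3))  # 4
print(count_in_range(L_prefix, 2, 3))  # 2
```

The preprocessing happens once per list; each later (x, y) query is two array lookups and a subtraction.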

MaxDoubleSliceSum - codility

I'm trying to solve the MaxDoubleSliceSum problem without Kadane's bidirectional algorithm.
Problem Definition:
A non-empty array A consisting of N integers is given.
A triplet (X, Y, Z), such that 0 ≤ X < Y < Z < N, is called a double
slice.
The sum of double slice (X, Y, Z) is the total of A[X + 1] + A[X + 2]
+ ... + A[Y − 1] + A[Y + 1] + A[Y + 2] + ... + A[Z − 1].
For example, array A such that:
A[0] = 3
A[1] = 2
A[2] = 6
A[3] = -1
A[4] = 4
A[5] = 5
A[6] = -1
A[7] = 2
The goal is to find the maximal sum of any double slice.
Write a function that, given a non-empty array A consisting of N integers, returns the
maximal sum of any double slice.
For example, given:
A[0] = 3
A[1] = 2
A[2] = 6
A[3] = -1
A[4] = 4
A[5] = 5
A[6] = -1
A[7] = 2
the function should return 17, because no double slice of array A has
a sum of greater than 17.
I have figured out following idea:
I'm taking a slice and putting the lever (the value in the middle that's being dropped) at the lowest value included in this slice. If I notice that the next value lowers the total sum, I change the lever to it and reduce the sum by the values before the last lever (including the old lever).
int solution(vector<int> &A) {
    if(A.size()<4)
        return 0;
    int lever=A[1];
    int sum=-lever;
    int presliceValue=0;
    int maxVal=A[1];
    for(int i=1;i<A.size()-1;i++){
        if(sum+A[i]<sum || A[i]<lever){
            sum+=lever;
            if(presliceValue<0)
                sum=sum-presliceValue;
            lever=A[i];
            presliceValue=sum+lever;
        }
        else
            sum=sum+A[i];
        if(sum>maxVal)
            maxVal=sum;
    }
    return maxVal;
}
This solution returns a wrong value on a few test cases. Unfortunately I cannot reproduce the errors, and Codility does not share the test values.
Failed Test cases
many the same small sequences, length = ~100,000
large random: random, length = ~100,000
random, numbers from -30 to 30, length = 300
random, numbers from -10^4 to 10^4, length = 70
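Since the failing inputs are not published, one way to hunt for a counterexample is to compare a candidate solution against a brute-force reference on small random arrays. The sketch below is such a reference in Python, written directly from the definition of a double slice. The name `max_double_slice_brute` is invented for this example, and the function is cubic in the number of triplets, so it is only suitable for short arrays.

```python
def max_double_slice_brute(A):
    """Try every triplet 0 <= X < Y < Z < N and return the best double-slice sum.

    Returns None when the array has fewer than 3 elements (no triplet exists).
    """
    n = len(A)
    best = None
    for x in range(n - 2):
        for y in range(x + 1, n - 1):
            for z in range(y + 1, n):
                # A[X+1..Y-1] plus A[Y+1..Z-1], exactly as in the definition.
                s = sum(A[x + 1:y]) + sum(A[y + 1:z])
                if best is None or s > best:
                    best = s
    return best


print(max_double_slice_brute([3, 2, 6, -1, 4, 5, -1, 2]))  # 17
```

Feeding both this function and the solution under test with random arrays of length 3 to 10 and values in a small range such as -30 to 30 should surface a disagreeing input quickly, if one exists.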

Can you get a list of the powers in a polynomial? Pari GP

I'm working with single variable polynomials with coefficients +1/-1 (and zero). These can be very long and the range of powers can be quite big. It would be convenient for me to view the powers as a vector - is there any way of doing this quickly? I had hoped there would be a command already in Pari to do this, but I can't seem to see one?
Just an example to confirm what I'm trying to do...
Input:x^10 - x^8 + x^5 - x^2 + x + 1
Desired output: [10, 8, 5, 2, 1, 0]
You can use Vecrev to get the polynomial coefficients. After that just enumerate them to select the zero-based positions of non-zeros. You want the following one-liner:
nonzeros(xs) = Vecrev([x[2]-1 | x <- select(x -> x[1] != 0, vector(#xs, i, [xs[i], i]))])
Now you can easily get the list of polynomial powers:
p = x^10 - x^8 + x^5 - x^2 + x + 1
nonzeros(Vecrev(p))
>> [10, 8, 5, 2, 1, 0]

All possible combinations of coins

I need to write a program which displays all possible change combinations given an array of denominations [1 , 2, 5, 10, 20, 50, 100, 200] // 1 = 1 cent
Value to make the change from = 300
I'm basing my code on the solution from this site http://www.geeksforgeeks.org/dynamic-programming-set-7-coin-change/
#include<stdio.h>
int count( int S[], int m, int n )
{
    int i, j, x, y;
    // We need n+1 rows as the table is constructed in bottom up manner using
    // the base case 0 value case (n = 0)
    int table[n+1][m];
    // Fill the entries for 0 value case (n = 0)
    for (i=0; i<m; i++)
        table[0][i] = 1;
    // Fill rest of the table entries in bottom up manner
    for (i = 1; i < n+1; i++)
    {
        for (j = 0; j < m; j++)
        {
            // Count of solutions including S[j]
            x = (i-S[j] >= 0)? table[i - S[j]][j]: 0;
            // Count of solutions excluding S[j]
            y = (j >= 1)? table[i][j-1]: 0;
            // total count
            table[i][j] = x + y;
        }
    }
    return table[n][m-1];
}
// Driver program to test above function
int main()
{
    int arr[] = {1, 2, 5, 10, 20, 50, 100, 200}; //coins array
    int m = sizeof(arr)/sizeof(arr[0]);
    int n = 300; //value to make change from
    printf(" %d ", count(arr, m, n));
    return 0;
}
The program runs fine. It displays the number of all possible combinations, but I need it to be more advanced. The way I need it to work is to display the result in following fashion:
1 cent: n number of possible combinations.
2 cents:
5 cents:
and so on...
How can I modify the code to achieve that ?
Greedy Algorithm Approach
Have this denominations in an int array say, int den[] = [1 , 2, 5, 10, 20, 50, 100, 200]
Iterate over this array
For each iteration do the following
Take the element in the denominations array
Divide the change to be allotted number by the element in denominations array number
If the change allotted number is perfectly divisible by the number in denomination array then you are done with the change for that number.
If the number is not perfectly divisible then check for the remainder and do the same iteration with other number
Exit the inner iteration once you get the value equal to the change number
Do the same for the next denomination available in our denomination array.
Explained with example
den = [1 , 2, 5, 10, 20, 50, 100, 200]
Change to be allotted: 270, let's take this as x
and y be the temporary variable
Change map z[coin denomination, count of coins]
int y, z[];
First iteration :
den = 1
x = 270
y = 270/1;
if x is equal to y*den
then z[den, y] // z[1, 270]
Iteration completed
Second Iteration:
den = 2
x = 270
y = 270/2;
if x is equal to y*den
then z[den , y] // [2, 135]
Iteration completed
Let's take an odd number
x = 217 and den = 20
y= 217/20;
now x is not equal to y*den
then update z[den, y] // [20, 10]
find new x = x - den*y = 17
x=17 and identify the next change value by greedy it would be 10
den = 10
y = 17/10
now x is not equal to y*den
then update z[den, y] // [10, 1]
find new x = x - den*y = 7
then do the same and your map would be having following entries
[20, 10]
[10, 1]
[5, 1]
[2, 1]
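The walkthrough above can be sketched in Python roughly as follows. This is one reading of the described steps, not a definitive implementation: `greedy_from` is a name made up for this example, and it assumes that after the starting denomination only smaller denominations are tried, largest first. Note that it yields one combination per starting denomination, not every possible combination.

```python
def greedy_from(amount, start, denominations):
    """Use `start` as often as possible, then greedily fill the remainder
    with the smaller denominations, largest first.

    Returns a dict {denomination: count}.
    """
    counts = {}
    remaining = amount
    usable = sorted((d for d in denominations if d <= start), reverse=True)
    for den in usable:
        if remaining == 0:
            break
        count = remaining // den
        if count:
            counts[den] = count
            remaining -= den * count
    return counts


den = [1, 2, 5, 10, 20, 50, 100, 200]
print(greedy_from(270, 1, den))   # {1: 270}
print(greedy_from(270, 2, den))   # {2: 135}
print(greedy_from(217, 20, den))  # {20: 10, 10: 1, 5: 1, 2: 1}
```

Calling `greedy_from(x, d, den)` for every d in `den` reproduces the outer iteration over the denominations array.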

Linear index upper triangular matrix

If I have the upper triangular portion of a matrix, offset above the diagonal, stored as a linear array, how can the (i,j) indices of a matrix element be extracted from the linear index of the array?
For example, the linear array [a0, a1, a2, a3, a4, a5, a6, a7, a8, a9] is storage for the matrix
0 a0 a1 a2 a3
0 0 a4 a5 a6
0 0 0 a7 a8
0 0 0 0 a9
0 0 0 0 0
And we want to know the (i,j) index in the matrix corresponding to an offset in the linear array, without recursion.
A suitable result, k2ij(int k, int n) -> (int, int) would satisfy, for example
k2ij(k=0, n=5) = (0, 1)
k2ij(k=1, n=5) = (0, 2)
k2ij(k=2, n=5) = (0, 3)
k2ij(k=3, n=5) = (0, 4)
k2ij(k=4, n=5) = (1, 2)
k2ij(k=5, n=5) = (1, 3)
[etc]
The equations going from linear index to (i,j) index are
i = n - 2 - floor(sqrt(-8*k + 4*n*(n-1)-7)/2.0 - 0.5)
j = k + i + 1 - n*(n-1)/2 + (n-i)*((n-i)-1)/2
The inverse operation, from (i,j) index to linear index is
k = (n*(n-1)/2) - (n-i)*((n-i)-1)/2 + j - i - 1
Verify in Python with:
from numpy import triu_indices, sqrt
n = 10
# // keeps everything an integer under Python 3
for k in range(n*(n-1)//2):
    i = n - 2 - int(sqrt(-8*k + 4*n*(n-1)-7)/2.0 - 0.5)
    j = k + i + 1 - n*(n-1)//2 + (n-i)*((n-i)-1)//2
    assert triu_indices(n, k=1)[0][k] == i
    assert triu_indices(n, k=1)[1][k] == j
for i in range(n):
    for j in range(i+1, n):
        k = (n*(n-1)//2) - (n-i)*((n-i)-1)//2 + j - i - 1
        assert triu_indices(n, k=1)[0][k] == i
        assert triu_indices(n, k=1)[1][k] == j
First, let's renumber a[k] in opposite order. We'll get:
0 a9 a8 a7 a6
0 0 a5 a4 a3
0 0 0 a2 a1
0 0 0 0 a0
0 0 0 0 0
Then k2ij(k, n) will become k2ij(n*(n-1)/2 - 1 - k, n) in the renumbered matrix (for n = 5 that is 9 - k).
Now, the question is, how to calculate k2ij(k, n) in this new matrix. The sequence 0, 2, 5, 9 (indices of diagonal elements) corresponds to triangular numbers (after subtracting 1): a[n - i, n + 1 - i] = Ti - 1. Ti = i * (i + 1)/2, so if we know Ti, it's easy to solve this equation and get i (see formula in the linked wiki article, section "Triangular roots and tests for triangular numbers"). If k + 1 is not exactly a triangular number, the formula will still give you the useful result: after rounding it down, you'll get the highest value of i, for which Ti <= k, this value of i corresponds to the row index (counting from bottom), in which a[k] is located. To get the column (counting from right), you should simply calculate the value of Ti and subtract it: j = k + 1 - Ti. To be clear, these are not exactly i and j from your problem, you need to "flip" them.
I didn't write the exact formula, but I hope that you got the idea, and it will now be trivial to find it after performing some boring but simple calculations.
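For reference, here is one way the "boring but simple calculations" might come out in Python, using zero-based indices and an integer square root instead of floating point. This is a sketch worked out for this write-up rather than the answerer's own formula; the helper `ij2k` uses the compact inverse k = (2n - 3 - i) * i / 2 + j - 1, and `math.isqrt` needs Python 3.8+.

```python
from itertools import combinations
from math import isqrt


def k2ij(k, n):
    """Map a linear index k to (i, j) in the strict upper triangle of an n x n matrix."""
    total = n * (n - 1) // 2
    r = total - 1 - k                    # index after reversing the numbering
    t = (isqrt(8 * r + 1) - 1) // 2      # largest t with t*(t+1)/2 <= r
    i = n - 2 - t                        # flip the row (t counts from the bottom)
    j = n - 1 - (r - t * (t + 1) // 2)   # flip the column (offset counts from the right)
    return i, j


def ij2k(i, j, n):
    """Inverse mapping from (i, j) with i < j back to the linear index."""
    return (2 * n - 3 - i) * i // 2 + j - 1


print([k2ij(k, 5) for k in range(5)])  # [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
```

`itertools.combinations(range(n), 2)` lists the pairs (i, j) with i < j in exactly this row-major order, which gives a dependency-free way to check both directions.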
The following is an implementation in MATLAB, which can easily be ported to another language, like C++. Here, we suppose the matrix has size m*m, and ind is the index in the linear array. The only difference is that here we count the lower triangular part of the matrix column by column, which is analogous to your case (counting the upper triangular part row by row).
function z= ind2lTra (ind, m)
rvLinear = (m*(m-1))/2-ind;
k = floor( (sqrt(1+8*rvLinear)-1)/2 );
j= rvLinear - k*(k+1)/2;
z=[m-j, m-(k+1)];
For the records, this is the same function, but with one-based indexing, and in Julia:
function iuppert(k::Integer,n::Integer)
    i = n - 1 - floor(Int,sqrt(-8*k + 4*n*(n-1) + 1)/2 - 0.5)
    j = k + i + ( (n-i+1)*(n-i) - n*(n-1) )÷2
    return i, j
end
Here is a more efficient formulation for k:
k = (2 * n - 3 - i) * i / 2 + j - 1
In python 2:
def k2ij(k, n):
    rows = 0
    for t, cols in enumerate(xrange(n - 1, -1, -1)):
        rows += cols
        if k in xrange(rows):
            return (t, n - (rows - k))
    return None
In Python with NumPy, a concise way is:
import numpy as np
array_size= 3
# make indices using k argument if you want above the diagonal
u, v = np.triu_indices(n=array_size,k=1)
# assuming linear indices above the diagonal i.e. 0 means (0,1) and not (0,0)
linear_indices = [0,1]
ijs = [(i,j) for (i,j) in zip(u[linear_indices], v[linear_indices])]
ijs
#[(0, 1), (0, 2)]