Subsequence having sum at most 'k' - c++

Given a non-decreasing array A of size n and an integer k, how can I find a subsequence S of A with the maximum possible sum of its elements, such that this sum is at most k? If there are multiple such subsequences, finding any one of them is enough.
For example, let the array be {1, 2, 2, 4}, so n = 4, and let k = 7. Then the answer should be {1, 2, 4}.
The brute-force approach takes approximately O(n(2^n-1)) time. Is there a more efficient solution to this problem?

In the general case the answer is no.
Just deciding whether there is a solution whose elements sum to exactly k is equivalent to the Subset Sum Problem and thus already NP-complete.
The Subset Sum Problem can be equivalently formulated as: given the integers or natural numbers w_1, ..., w_n, does any subset of them sum to precisely W?
However, if either n or the number of bits P that it takes to represent the largest number w is small, there might be a more efficient solution (e.g., a pseudo-polynomial solution based on dynamic programming if P is small). Additionally, if all your numbers w are positive, it might also be possible to find a better solution.
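The pseudo-polynomial dynamic programming idea mentioned above can be sketched as follows. This is only an illustration under the assumption that all elements are positive integers and that k is small enough to allocate an array of k + 1 entries; the function name `best_subset_at_most` and the `from` bookkeeping array are hypothetical, not part of the question.

```cpp
#include <cassert>
#include <vector>

// Returns one subset of `a` whose sum is as large as possible without
// exceeding k. Assumes every element of `a` is a positive integer.
// Runs in O(n * k) time and O(k) extra memory.
std::vector<int> best_subset_at_most(const std::vector<int>& a, int k) {
    assert(k >= 0);
    // from[s] == -1 : sum s is not reachable (so far)
    // from[0] == -2 : sentinel for the empty subset
    // otherwise     : index of the element that first made sum s reachable
    std::vector<int> from(k + 1, -1);
    from[0] = -2;
    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        assert(a[i] > 0);
        // Go downwards so that element i is used at most once.
        for (int s = k; s >= a[i]; --s) {
            if (from[s] == -1 && from[s - a[i]] != -1) {
                from[s] = i;
            }
        }
    }
    int best = k;
    while (from[best] == -1) --best;  // from[0] is always reachable
    // Walk back through the recorded elements to rebuild the subset.
    std::vector<int> subset;
    for (int s = best; s > 0; s -= a[from[s]]) {
        subset.push_back(a[from[s]]);
    }
    return subset;
}
```

The running time depends on the numeric value of k, not just on n, which is why this does not contradict the NP-completeness of the general problem.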

Related

Count Divisors of Product from L to R

I have been solving a problem but then got stuck upon its subpart which is as follows:
Given an array of N elements whose ith element is A[i] and we are given Q queries of the type [L,R].
For each query output the number of divisors of product from Lth element to Rth element.
More formally, for each query let's define P as P = A[L] * A[L+1] * A[L+2] * ... * A[R].
Output the number of divisors of P modulo 998244353.
Constraints : 1<= N,Q <= 100000, 1<= A[i] <= 1000000.
My approach:
For each index i, I have defined a map< int, int > which stores each prime divisor and its count in the product over the range [1, i].
I extract the prime divisors of each number in O(log N) using a sieve.
Then for each query (let's say {L,R}), I iterate through the map of the Lth element and subtract the count of each key from the map of the Rth element.
And then I am answering the query using the result:
if N = a^p * b^q * c^r ...(a,b,c being primes)
the number of divisors = (p+1)(q+1)(r+1)..
The time complexity of the above solution is O(ND + QD), where D = number of distinct prime numbers up to 1000000. In the worst case D = 78498.
Is there a more efficient solution than this?
There is a more efficient solution for this. But it is slightly complicated. Here are steps to get to the necessary data structure.
1. Define a data type prime_factor that is a struct that contains a prime and a count.
2. Define a data type prime_factorization that is a vector of the first data type in ascending size of the primes. This can store the factorization of a number.
3. Write a function that takes a number, and turns its prime factorization into a prime_factorization.
4. Write a function that takes 2 prime_factorization vectors and merges them into the factorization of the product of the two.
5. For each number in your array, compute its prime factorization. That gets stored in an array.
6. For each pair in your array, compute the prime factorization of the product. We will only need half of them. So elements 0, 1 go into one factorization, 2, 3 into the next and so on.
7. Repeat step 6 O(log(N)) times. So you have a vector of the factorization of each number, pairs, fours, eights, and so on. This results in approximately 2N precomputed factorization vectors. Most vectors are small though a few can be up to O(D) in size (where D is the number of distinct primes). Most of the merges should be very, very fast.
And now you have all of your data prepared. It can't take more than O(log(N)) times the space that storing the prime factors required by itself. (Less than that normally, though, because repeats among the small primes get gathered together in one prime_factor.)
Any range is the union of at most O(log(N)) of these computed vectors. For example the range 10..25 can be broken up into 10..11, 12..15, 16..23, 24..25. Arrange these intervals from smallest to largest and merge them. Then compute your answer from the result.
An exact analysis is complicated. But I assure you that query time is bounded above by O(Q * D * log(N)) and normally is much less than that.
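As a rough sketch of the building blocks described above, the two data types, the merge function and the final divisor count might look like this. The names `factorize`, `merge_factorizations` and `divisor_count` are hypothetical, and `factorize` uses plain trial division for brevity where the question uses a sieve.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct prime_factor {
    int prime;
    int count;
};
// Factors are kept in ascending order of the primes.
using prime_factorization = std::vector<prime_factor>;

// Trial-division factorization (an assumption made for brevity).
prime_factorization factorize(int n) {
    assert(n >= 1);
    prime_factorization out;
    for (int p = 2; static_cast<long long>(p) * p <= n; ++p) {
        int count = 0;
        while (n % p == 0) { n /= p; ++count; }
        if (count > 0) out.push_back({p, count});
    }
    if (n > 1) out.push_back({n, 1});  // leftover prime factor
    return out;
}

// Merges two sorted factorizations into the factorization of the product.
prime_factorization merge_factorizations(const prime_factorization& a,
                                         const prime_factorization& b) {
    prime_factorization out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].prime < b[j].prime) {
            out.push_back(a[i++]);
        } else if (b[j].prime < a[i].prime) {
            out.push_back(b[j++]);
        } else {  // same prime: add the exponents
            out.push_back({a[i].prime, a[i].count + b[j].count});
            ++i;
            ++j;
        }
    }
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}

// Number of divisors, (p+1)(q+1)(r+1)..., taken modulo `mod`.
std::uint64_t divisor_count(const prime_factorization& f,
                            std::uint64_t mod = 998244353) {
    std::uint64_t result = 1 % mod;
    for (const prime_factor& pf : f) {
        result = result * ((static_cast<std::uint64_t>(pf.count) + 1) % mod) % mod;
    }
    return result;
}
```

For instance, merging the factorizations of 12 and 18 gives 2^3 * 3^3, the factorization of 216.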
UPDATE:
How do you find those intervals?
The answer is that you need to identify the number divisible by the highest power of 2 in the range, and then fill out both sides from there. You figure that out by dividing both ends of the range by 2 (rounding down) until they differ by 1. Then multiply the top boundary by the power of 2 you divided by to find that mid-point.
For example if your range was 35-53 you would start by dividing by 2 to get 35-53, 17-26, 8-13, 4-6, 2-3. That was 2^4 we divided by. Our power of 2 mid-point is 3*2^4 = 48. Our intervals above that midpoint are then 48-51, 52-53. Our intervals below are 40-47, 36-39, 35-35. And each of them is of length a power of 2 and starts at a number divisible by that power of 2.
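One way to produce such intervals in code is a simple left-to-right greedy walk, shown here as a sketch. It is a different procedure from the mid-point method described above, but it yields blocks with the same property: each has a power-of-2 length and starts at a multiple of that length. The function name `aligned_blocks` is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Splits the inclusive range [lo, hi] into blocks whose length is a power
// of 2 and whose start is divisible by that length.
// Assumes 0 <= lo <= hi < 2^30 so that the arithmetic cannot overflow.
std::vector<std::pair<int, int>> aligned_blocks(int lo, int hi) {
    assert(lo >= 0 && lo <= hi && hi < (1 << 30));
    std::vector<std::pair<int, int>> blocks;
    while (lo <= hi) {
        int size = 1;
        // Double the block while it stays aligned and inside the range.
        while (lo % (size * 2) == 0 && lo + size * 2 - 1 <= hi) {
            size *= 2;
        }
        blocks.emplace_back(lo, lo + size - 1);
        lo += size;
    }
    return blocks;
}
```

For the range 5..12 this walk produces 5..5, 6..7, 8..11 and 12..12.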

Most equivalent factors of a number

Given a number 'n', which is a power of 2, how can I efficiently find the 2 factors which are closest to each other? In other words, if I have a linear array and want to map it to 2D, how can I find the 2D dimensions that are the most equal (image dimensions closest to a square)?
Gotta be some kind of bitwise operation to make this fast, rather than looping over factors.
n is representable as 2^k (since you say it's a power of 2). If k is even, then n == 2^(k/2) * 2^(k/2) (e.g. 16==4*4). If k is odd, then the closest you can get is n == 2^((k-1)/2) * 2^((k+1)/2) (e.g. 8==2*4)
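A minimal sketch of that idea with shifts, assuming n fits in 64 bits and really is a power of 2 (the function name `squarest_factors` is made up for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// For n == 2^k, returns the factor pair (2^(k/2), 2^(k - k/2)),
// smaller factor first.
std::pair<std::uint64_t, std::uint64_t> squarest_factors(std::uint64_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);  // n must be a power of 2
    unsigned k = 0;
    while ((n >> k) > 1) ++k;              // k = log2(n)
    const std::uint64_t small = std::uint64_t{1} << (k / 2);
    const std::uint64_t large = std::uint64_t{1} << (k - k / 2);
    return {small, large};
}
```

The loop that finds k could be replaced by a count-trailing-zeros instruction where one is available; the loop is used here only to keep the sketch portable.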

Maximum number not coprime to V

Given a fixed array A of N integers where N<=100,000 and all elements of array are also less than or equal to 100,000. The numbers in A are not monotonically increasing or contiguous or otherwise conveniently organized.
Now I am given up to 100,000 queries of the form {V, L, R} where in each query I need to find the largest number A[i] with i in the range [L,R] that is not coprime with the given value V. (That is GCD(V,A[i]) is not equal to 1.)
If no such number exists, report that all numbers in the given range are coprime to V.
A basic approach would be to iterate over each A[i] between L and R, compute its GCD with V, and hence find the maximum. But is there a better way to do it if the number of queries can be up to 100,000 too? In that case, it's too inefficient to check each number each time.
Example:
Let us have N=6 and the array be [1,2,3,4,5,4] and let V be 2 and range [L,R] is [2,5].
Then the answer is 4.
Explanation:
GCD(2,2)=2
GCD(2,3)=1
GCD(2,4)=2
GCD(2,5)=1
So maximum is 4 here.
Since you have a large array but only one V, it should be faster to start by factorizing V. After that your coprime test becomes simply finding the remainder modulo each unique factor of V.
Daniel Bernstein's "Factoring into coprimes in essentially linear time" (Journal of Algorithms 54:1, 1-30 (2005)) answers a similar question, and is used to identify bad (repeat factor) RSA moduli by Nadia Heninger's "New research: There's No Need to Panic Over Factorable Keys--Just Mind Your Ps and Qs". The problem there is to find common factors between a huge set of very large numbers, without going a pair at a time.
Let's say that
    V = p_1*...*p_n
where p_i is a prime number (you can restrict it to distinct primes only). Now the answer is
    result = -1
    for p_i in [p_1, ..., p_n]:
        res = floor(R / p_i) * p_i
        if res >= L and res > result:
            result = res
So if you can factorize V fast then this will be quite efficient.
EDIT I didn't notice that the array does not have to contain all integers. In that case sieve it, i.e. given prime numbers p_1, ..., p_n create a "reversed" sieve (i.e. all multiples of primes in range [L, R]). Then you can just do an intersection of that sieve with your initial array.
EDIT2 To generate the set of all multiples you can use this algorithm:
    primes = [p_1, ..., p_n]
    multiples = []
    for p in primes:
        lower = ceil(L / p)
        upper = floor(R / p)
        for i in [lower, upper]:
            multiples.append(i*p)
The important thing is that it follows from math that V is coprime with every number in range [L, R] which is not in multiples. Now you simply do:
    solution = -1
    for no in initial_array:
        if no in multiples:
            solution = max(solution, no)
Note that if you implement multiples as a set, then the if no in multiples: check is O(1).
EXAMPLE Let's say that V = 6 = 2*3 and initial_array = [7,11,12,17,21] and L=10 and R=22. Let's start with multiples. Following the algorithm we obtain that
multiples = [10, 12, 14, 16, 18, 20, 22, 12, 15, 18, 21]
First 7 are multiples of 2 (in range [10, 22]) and last 4 are multiples of 3 (in range [10, 22]). Since we are dealing with sets (std::set?) then there will be no duplicates (12 and 18):
multiples = [10, 12, 14, 16, 18, 20, 22, 15, 21]
Now go through the initial_array and check what values are in multiples. We obtain that the biggest such number is 21. And indeed 21 is not coprime with 6.
Factor each of A's elements and store, for each possible prime factor, a sorted list of the numbers that contain this factor.
Since a number n has O(log n) prime factors, these lists will use O(N log N) memory in total.
Then, for each query (V, L, R), find, for each prime factor of V, the maximum number within [L, R] that contains that factor (this can be done with a simple binary search).
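A rough sketch of this per-prime index idea follows. It makes several assumptions that are not in the answer: the lists hold array positions sorted by index, the binary search only locates the first position inside [L, R], and the maximum is then found by scanning the positions up to R, which can be linear in the worst case (a range-maximum structure per prime would avoid that). The class name `NotCoprimeIndex`, the 0-based inclusive indices and the smallest-prime-factor sieve are all illustrative choices.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class NotCoprimeIndex {
public:
    explicit NotCoprimeIndex(const std::vector<int>& a, int max_value = 100000)
        : a_(a), spf_(max_value + 1, 0), positions_(max_value + 1) {
        // Sieve of smallest prime factors.
        for (int i = 2; i <= max_value; ++i) {
            if (spf_[i] != 0) continue;
            for (int j = i; j <= max_value; j += i) {
                if (spf_[j] == 0) spf_[j] = i;
            }
        }
        // positions_[p] lists, in increasing order, the indices i with p | a[i].
        for (int i = 0; i < static_cast<int>(a_.size()); ++i) {
            assert(a_[i] >= 1 && a_[i] <= max_value);
            for (int p : distinct_primes(a_[i])) positions_[p].push_back(i);
        }
    }

    // Largest a[i] with l <= i <= r (0-based) and gcd(v, a[i]) != 1,
    // or -1 if every element in the range is coprime to v.
    int query(int v, int l, int r) const {
        assert(v >= 1 && v < static_cast<int>(spf_.size()));
        int best = -1;
        for (int p : distinct_primes(v)) {
            const std::vector<int>& pos = positions_[p];
            auto it = std::lower_bound(pos.begin(), pos.end(), l);
            for (; it != pos.end() && *it <= r; ++it) {
                best = std::max(best, a_[*it]);
            }
        }
        return best;
    }

private:
    std::vector<int> distinct_primes(int x) const {
        std::vector<int> primes;
        while (x > 1) {
            const int p = spf_[x];
            primes.push_back(p);
            while (x % p == 0) x /= p;
        }
        return primes;
    }

    std::vector<int> a_;
    std::vector<int> spf_;
    std::vector<std::vector<int>> positions_;
};
```

With the array [1,2,3,4,5,4] from the question, V = 2 and the 1-based range [2,5] (0-based [1,4]), `query(2, 1, 4)` returns 4.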

USACO: Subsets (Inefficient)

I am trying to solve subsets from the USACO training gateway...
Problem Statement
For many sets of consecutive integers from 1 through N (1 <= N <= 39), one can partition the set into two sets whose sums are identical.
For example, if N=3, one can partition the set {1, 2, 3} in one way so that the sums of both subsets are identical:
{3} and {1,2}
This counts as a single partitioning (i.e., reversing the order counts as the same partitioning and thus does not increase the count of partitions).
If N=7, there are four ways to partition the set {1, 2, 3, ... 7} so that each partition has the same sum:
{1,6,7} and {2,3,4,5}
{2,5,7} and {1,3,4,6}
{3,4,7} and {1,2,5,6}
{1,2,4,7} and {3,5,6}
Given N, your program should print the number of ways a set containing the integers from 1 through N can be partitioned into two sets whose sums are identical. Print 0 if there are no such ways.
Your program must calculate the answer, not look it up from a table.
End
Before, I was running an O(N*2^N) solution that simply enumerated the subsets of the set and computed their sums.
Finding out how horribly inefficient that was, I moved on to mapping the sum sequences...
http://en.wikipedia.org/wiki/Composition_(number_theory)
After many coding problems to scrape out repetitions, still too slow, so I am back to square one :(.
Now that I look more closely at the problem, it looks like I should try to find a way to not find the sums, but actually go directly to the number of sums via some kind of formula.
If anyone can give me pointers on how to solve this problem, I'm all ears. I program in Java, C++ and Python.
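Actually, there is a better and simpler solution. You should use Dynamic Programming instead. In your code, you would have an array of integers (whose size is the sum), where each value at index i represents the number of ways to possibly partition the numbers so that one of the partitions has a sum of i. Here is what your code could look like in C++:
    #include <vector>

    // Counts the ways to split {1, ..., n} into two sets with equal sums.
    long long solve(int n) {
        int sum = n * (n + 1) / 2;       // sum of the consecutive integers
        if (sum % 2 == 1)
            return 0;                    // an odd total cannot be split evenly
        // dp[j] = number of subsets of the integers seen so far with sum j;
        // long long because the counts exceed the int range for N = 39
        std::vector<long long> dp(sum + 1, 0);
        dp[0] = 1;
        for (int val = 1; val <= n; val++) {
            // go downwards so that each integer is used at most once
            for (int j = sum - val; j >= 0; j--) {
                dp[j + val] += dp[j];
            }
        }
        return dp[sum / 2] / 2;          // each partition was counted twice
    }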
This gives you an O(N^3) solution, which is by far fast enough for this problem.
I haven't tested this code, so there might be a syntax error or something, but you get the point. Let me know if you have any more questions.
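This is the same thing as finding the coefficient of the x^0 term in the polynomial (x+1/x)(x^2+1/x^2)...(x^n+1/x^n) and dividing it by 2 (each partition shows up twice, once for each choice of which side gets the plus signs), which should take about an upper bound of O(n^3).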
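To make the polynomial view concrete, here is a small sketch that multiplies out the factors (x^i + 1/x^i) one at a time and reads off the x^0 coefficient. The function name and the offset-by-S array layout are assumptions made for illustration. The coefficient counts every partition twice, once per choice of which side gets the plus signs, so it is halved at the end.

```cpp
#include <cassert>
#include <vector>

// Expands (x + 1/x)(x^2 + 1/x^2)...(x^n + 1/x^n) and returns half of the
// x^0 coefficient, i.e. the number of equal-sum partitions of {1..n}.
// Assumes 1 <= n <= 39, as in the problem statement.
long long count_partitions_via_polynomial(int n) {
    assert(n >= 1 && n <= 39);
    const int s = n * (n + 1) / 2;  // largest exponent that can occur
    // coef[e + s] holds the coefficient of x^e for -s <= e <= s.
    std::vector<long long> coef(2 * s + 1, 0);
    coef[s] = 1;                    // start from the constant polynomial 1
    for (int i = 1; i <= n; ++i) {
        std::vector<long long> next(2 * s + 1, 0);
        for (int e = -s; e <= s; ++e) {
            const long long c = coef[e + s];
            if (c == 0) continue;
            // Nonzero terms satisfy |e| <= 1 + ... + (i-1), so e +/- i
            // stays within [-s, s].
            next[e + i + s] += c;   // multiply by x^i
            next[e - i + s] += c;   // multiply by 1/x^i
        }
        coef = next;
    }
    return coef[s] / 2;
}
```

There are n factors and O(n^2) exponents per step, which matches the O(n^3) bound mentioned above.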

Find a prime number?

To find whether N is a prime number we only need to check for divisors among the numbers less than or equal to sqrt(N). Why is that? I am writing this in C and am trying to understand the reason behind it.
N is prime if it is a positive integer which is divisible by exactly two positive integers, 1 and N. Since a number's divisors cannot be larger than that number, this gives rise to a simple primality test:
If an integer N, greater than 1, is not divisible by any integer in the range [2, N-1], then N is prime. Otherwise, N is not prime.
However, it would be nice to modify this test to make it faster. So let us investigate.
Note that the divisors of N occur in pairs. If N is divisible by a number M, then it is also divisible by N/M. For instance, 12 is divisible by 6, and so also by 2. Furthermore, if M >= sqrt(N), then N/M <= sqrt(N).
This means that if no numbers less than or equal to sqrt(N) divide N, no numbers greater than sqrt(N) divide N either (excepting 1 and N themselves), otherwise a contradiction would arise.
So we have a better test:
If an integer N, greater than 1, is not divisible by any integer in the range [2, sqrt(N)], then N is prime. Otherwise, N is not prime.
If you consider the reasoning above, you should see that a number which passes this test also passes the first test, and a number which fails this test also fails the first test. The tests are therefore equivalent.
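A minimal sketch of the improved test, written in C++ (the same loop works in C with minor changes). The function name `is_prime` is just for illustration; the loop condition is written as d <= n / d rather than d * d <= n so that it cannot overflow for large n.

```cpp
// Trial division up to sqrt(n): n is prime if it is greater than 1 and
// no integer d with 2 <= d <= sqrt(n) divides it.
bool is_prime(long long n) {
    if (n < 2) return false;
    for (long long d = 2; d <= n / d; ++d) {  // same as d * d <= n
        if (n % d == 0) return false;         // found a divisor, so composite
    }
    return true;
}
```

For a perfect square such as 49 the loop still reaches d = 7, which is why the bound has to include sqrt(n) itself.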
A composite number (an integer greater than 1 that is not prime) has at least 1 pair of factors, and it is guaranteed that one of the numbers from each pair is less than or equal to the square root of the number (which is what you are asking about).
If you square the square root of the number, you get the number itself (sqrt(n) * sqrt(n) = n), so if you made one of the numbers bigger (than sqrt(n)) you would have to make the other one smaller. If you then only check the numbers 2 through sqrt(n) you will have checked all of the possible factors, since each of those factors will be paired with a number that is greater than sqrt(n) (except of course if the number is in fact a square of some other number, like 4, 9, 16, etc...but that doesn't matter since you know they aren't prime; they are easily factored by sqrt(n) itself).
The reason is simple: any factor bigger than the sqrt forces the other factor to be smaller than the sqrt. In that case, you should have already checked it.
Let n=a×b be composite.
Assume a>sqrt(n) and b>sqrt(n).
a×b > sqrt(n)×sqrt(n)
a×b > n
But we know a×b=n, which is a contradiction; therefore a≤sqrt(n) or b≤sqrt(n).
Since you only need to know a or b to show n is composite, you only need to check the numbers up to sqrt(n) to find such a number.
Because in the worst case, the number n can be expressed as a^2.
If the number can be expressed differently, that means that one of the divisors will be less than a = sqrt(n), but the other can be greater.