I want to find the non consecutive subsequences of a string divisible by a number k (say k = 3). One can call it a modification to the problem https://www.hackerrank.com/contests/w6/challenges/consecutive-subsequences/
For example, Input:
A = {1,2,3,4,1} k = 3
Output:
9
9 because 12,24,21,141,123,231,1231 etc. are possible
What I did for continuous subsequences was
long long get_count(const vector<int> & vec, int k) {
vector<int> cnt_mod(k, 0);
cnt_mod[0] = 1;
int pref_sum = 0;
for (int elem : vec) {
pref_sum += elem;
pref_sum %= k;
cnt_mod[pref_sum]++;
}
long long res = 0;
for (int mod = 0; mod < k; mod++)
res += (long long)cnt_mod[mod] * (cnt_mod[mod] - 1) / 2;
return res;
}
Can you please provide a suitable modification or a new approach(or code) to this to accomplish the required goal?
Thank You :)
Let DP[i][j] : the number of subsequences which form j as modulus when divided by a number .
You will need to know some Modular Arithmetic as pre requisite.
The recurrence is simple afterwards :
This is a small piece of code specifically for 3.
DP[0][(str[0]-'0')%3]=1;
for(i=1;str[i];i++)
{
DP[i][(str[i]-'0')%3]++;
for(j=0;j<=2;j++) // A Modulo B is always smaller than B
{
DP[i][j] += DP[i-1][j];
if(DP[i-1][j])
DP[i][(j*10+str[i]-'0')%3]+=DP[i-1][j];
}
}
First is the case when we skip the i th letter , and second case forms a sequence which gives modulo (j*10+str[i]-'0')%3 when i th letter is used.
We can drop the if statement
Related
I should implement this summation in C ++. I have tried with this code, but with very high numbers up to 10 ^ 12 it takes too long.
The summation is:
For any positive integer k, let d(k) denote the number of positive divisors of k (including 1 and k itself).
For example, for the number 4: 1 has 1 divisor, 2 has two divisors, 3 has two divisors, and 4 has three divisors. So the result would be 8.
This is my code:
#include <iostream>
#include <algorithm>
using namespace std;
int findDivisors(long long n)
{
int c=0;
for(int j=1;j*j<=n;j++)
{
if(n%j==0)
{
c++;
if(j!=(n/j))
{
c++;
}
}
}
return c;
}
long long compute(long long n)
{
long long sum=0;
for(int i=1; i<=n; i++)
{
sum += (findDivisors(i));
}
return sum;
}
int main()
{
int n, divisors;
freopen("input.txt", "r", stdin);
freopen("output.txt", "w", stdout);
cin >> n;
cout << compute(n);
}
I think it's not just a simple optimization problem, but maybe I should change the algorithm entirely.
Would anyone have any ideas to speed it up? Thank you.
largest_prime_is_463035818's answer shows an O(N) solution, but the OP is trying to solve this problem
with very high numbers up to 1012.
The following is an O(N1/2) algorithm, based on some observations about the sum
n/1 + n/2 + n/3 + ... + n/n
In particular, we can count the number of terms with a specific value.
Consider all the terms n/k where k > n/2. There are n/2 of those and all are equal to 1 (integer division), so that their sum is n/2.
Similar considerations hold for the other dividends, so that we can write the following function
long long count_divisors(long long n)
{
auto sum{ n };
for (auto i{ 1ll }, k_old{ n }, k{ n }; i < k ; ++i, k_old = k)
{ // ^^^^^ it goes up to sqrt(n)
k = n / (i + 1);
sum += (k_old - k) * i;
if (i == k)
break;
sum += k;
}
return sum;
}
Here it is tested against the O(N) algorithm, the only difference in the results beeing the corner cases n = 0 and n = 1.
Edit
Thanks again to largest_prime_is_463035818, who linked the Wikipedia page about the divisor summatory function, where both an O(N) and an O(sqrt(N)) algorithm are mentioned.
An implementation of the latter may look like this
auto divisor_summatory(long long n)
{
auto sum{ 0ll };
auto k{ 1ll };
for ( ; k <= n / k; ++k )
{
sum += n / k;
}
--k;
return 2 * sum - k * k;
}
They also add this statement:
Finding a closed form for this summed expression seems to be beyond the techniques available, but it is possible to give approximations. The leading behavior of the series is given by
D(x) = xlogx + x(2γ - 1) + Δ(x)
where γ is the Euler–Mascheroni constant, and the error term is Δ(x) = O(sqrt(x)).
I used your brute force approach as reference to have test cases. The ones I used are
compute(12) == 35
cpmpute(100) == 482
Don't get confused by computing factorizations. There are some tricks one can play when factorizing numbers, but you actually don't need any of that. The solution is a plain simple O(N) loop:
#include <iostream>
#include <limits>
long long compute(long long n){
long long sum = n+1;
for (long long i=2; i < n ; ++i){
sum += n/i;
}
return sum;
}
int main()
{
std::cout << compute(12) << "\n";
std::cout << compute(100) << "\n";
}
Output:
35
482
Why does this work?
The key is in Marc Glisse's comment:
As often with this kind of problem, this sum actually counts pairs x,
y where x divides y, and the sum is arranged to count first all x
corresponding to a fixed y, but nothing says you have to keep it that
way.
I could stop here, because the comment already explains it all. Though, if it didn't click yet...
The trick is to realize that it is much simpler to count divisors of all numbers up to n rather than n-times counting divisors of individual numbers and take the sum.
You don't need to care about factorizations of eg 123123123 or 52323423 to count all divisors up to 10000000000. All you need is a change of perspective. Instead of trying to factorize numbers, consider the divisors. How often does the divisor 1 appear up to n? Simple: n-times. How often does the divisor 2 appear? Still simple: n/2 times, because every second number is divisible by 2. Divisor 3? Every 3rd number is divisible by 3. I hope you can see the pattern already.
You could even reduce the loop to only loop till n/2, because bigger numbers obviously appear only once as divisor. Though I didn't bother to go further, because the biggest change is from your O(N * sqrt(N)) to O(N).
Let's start off with some math and reduce the O(n * sq(n)) factorization to O(n * log(log(n))) and for counting the sum of divisors the overall complexity is O(n * log(log(n)) + n * n^(1/3)).
For instance:
In Codeforces himanshujaju explains how we can optimize the solution of finding divisors of a number.
I am simplifying it a little bit.
Let, n as the product of three numbers p, q, and r.
so assume p * q * r = n, where p <= q <= r.
The maximum value of p = n^(1/3).
Now we can loop over all prime numbers in a range [2, n^(1/3)]
and try to reduce the time complexity of prime factorization.
We will split our number n into two numbers x and y => x * y = n.
And x contains prime factors up to n^(1/3) and y deals with higher prime factors greater than n^(1/3).
Thus gcd(x, y) = 1.
Now define F(n) as the number of prime factors of n.
From multiplicative rules, we can say that
F(x * y) = F(x) * F(y), if gcd(x, y) = 1.
For finding F(n) => F(x * y) = F(x) * F(y)
So first find F(x) then F(y) will F(n/x)
And there will 3 cases to cover for y:
1. y is a prime number: F(y) = 2.
2. y is the square of a prime number: F(y) = 3.
3. y is a product of two distinct prime numbers: F(y) = 4.
So once we are done with finding F(x) and F(y), we are also done with finding F(x * y) or F(n).
In Cp-Algorithm there is also a nice explanation of how to count the number of divisors on a number. And also in GeeksForGeeks a nice coding example of how to count the number of divisors of a number in an efficient way. One can check the articles and can generate a nice solution to this problem.
C++ implementation
#include <bits/stdc++.h>
using namespace std;
const int maxn = 1e6 + 11;
bool prime[maxn];
bool primesquare[maxn];
int table[maxn]; // for storing primes
void SieveOfEratosthenes()
{
for(int i = 2; i < maxn; i++){
prime[i] = true;
}
for(int i = 0; i < maxn; i++){
primesquare[i] = false;
}
// 1 is not a prime number
prime[1] = false;
for(int p = 2; p * p < maxn; p++){
// If prime[p] is not changed, then
// it is a prime
if(prime[p] == true){
// Update all multiples of p
for(int i = p * 2; i < maxn; i += p){
prime[i] = false;
}
}
}
int j = 0;
for(int p = 2; p < maxn; p++) {
if (prime[p]) {
// Storing primes in an array
table[j] = p;
// Update value in primesquare[p * p],
// if p is prime.
if(p < maxn / p) primesquare[p * p] = true;
j++;
}
}
}
// Function to count divisors
int countDivisors(int n)
{
// If number is 1, then it will have only 1
// as a factor. So, total factors will be 1.
if (n == 1)
return 1;
// ans will contain total number of distinct
// divisors
int ans = 1;
// Loop for counting factors of n
for(int i = 0;; i++){
// table[i] is not less than cube root n
if(table[i] * table[i] * table[i] > n)
break;
// Calculating power of table[i] in n.
int cnt = 1; // cnt is power of prime table[i] in n.
while (n % table[i] == 0){ // if table[i] is a factor of n
n = n / table[i];
cnt = cnt + 1; // incrementing power
}
// Calculating the number of divisors
// If n = a^p * b^q then total divisors of n
// are (p+1)*(q+1)
ans = ans * cnt;
}
// if table[i] is greater than cube root of n
// First case
if (prime[n])
ans = ans * 2;
// Second case
else if (primesquare[n])
ans = ans * 3;
// Third case
else if (n != 1)
ans = ans * 4;
return ans; // Total divisors
}
int main()
{
SieveOfEratosthenes();
int sum = 0;
int n = 5;
for(int i = 1; i <= n; i++){
sum += countDivisors(i);
}
cout << sum << endl;
return 0;
}
Output
n = 4 => 8
n = 5 => 10
Complexity
Time complexity: O(n * log(log(n)) + n * n^(1/3))
Space complexity: O(n)
Thanks, #largest_prime_is_463035818 for pointing out my mistake.
Problem statement: Given a set of n coins of some denominations (maybe repeating, in random order), and a number k. A game is being played by a single player in the following manner: Player can choose to pick 0 to k coins contiguously but will have to leave one next coin from picking. In this manner give the highest sum of coins he/she can collect.
Input:
First line contains 2 space-separated integers n and x respectively, which denote
n - Size of the array
x - Window size
Output:
A single integer denoting the max sum the player can obtain.
Working Soln Link: Ideone
long long solve(int n, int x) {
if (n == 0) return 0;
long long total = accumulate(arr + 1, arr + n + 1, 0ll);
if (x >= n) return total;
multiset<long long> dp_x;
for (int i = 1; i <= x + 1; i++) {
dp[i] = arr[i];
dp_x.insert(dp[i]);
}
for (int i = x + 2; i <= n; i++) {
dp[i] = arr[i] + *dp_x.begin();
dp_x.erase(dp_x.find(dp[i - x - 1]));
dp_x.insert(dp[i]);
}
long long ans = total;
for (int i = n - x; i <= n; i++) {
ans = min(ans, dp[i]);
}
return total - ans;
}
Can someone kindly explain how this code is working i.e., how line no. 12-26 in the Ideone solution is producing the correct answer?
I have dry run the code using pen and paper and found that it's giving the correct answer but couldn't figure out the algorithm used(if any). Can someone kindly explain to me how Line No. 12-26 is producing the correct answer? Is there any technique or algorithm at use here?
I am new to DP, so if someone can point out a tutorial(YouTube video, etc) related to this kind of problem, that would be great too. Thank you.
It looks like the idea is converting the problem - You must choose at least one coin in no more than x+1 coins in a row, and make it minimal. Then the original problem's answer would just be [sum of all values] - [answer of the new problem].
Then we're ready to talk about dynamic programming. Let's define a recurrence relation for f(i) which means "the partial answer of the new problem considering 1st to i-th coins, and i-th coin is chosen". (Sorry about the bad description, edits welcome)
f(i) = a(i) : if (i<=x+1)
f(i) = a(i) + min(f(i-1),f(i-2),...,f(i-x-1)) : otherwise
where a(i) is the i-th coin value
I added some comments line by line.
// NOTE f() is dp[] and a() is arr[]
long long solve(int n, int x) {
if (n == 0) return 0;
long long total = accumulate(arr + 1, arr + n + 1, 0ll); // get the sum
if (x >= n) return total;
multiset<long long> dp_x; // A min-heap (with fast random access)
for (int i = 1; i <= x + 1; i++) { // For 1 to (x+1)th,
dp[i] = arr[i]; // f(i) = a(i)
dp_x.insert(dp[i]); // Push the value to the heap
}
for (int i = x + 2; i <= n; i++) { // For the rest,
dp[i] = arr[i] + *dp_x.begin(); // f(i) = a(i) + min(...)
dp_x.erase(dp_x.find(dp[i - x - 1])); // Erase the oldest one from the heap
dp_x.insert(dp[i]); // Push the value to the heap, so it keeps the latest x+1 elements
}
long long ans = total;
for (int i = n - x; i <= n; i++) { // Find minimum of dp[] (among candidate answers)
ans = min(ans, dp[i]);
}
return total - ans;
}
Please also note that multiset is used as a min-heap. However we also need quick random-access(to erase the old ones) and multiset can do it in logarithmic time. So, the overall time complexity is O(n log x).
There is a ordered list like
A=[7, 9, 10, 11, 12, 13, 20]
and I have to find pairs a+b%10=k where 0<=k<=9
For example k = 0
Pairs: (7, 13), (9, 11), (10, 20)
How can i find the number of pairs in O(n) time?
I tried to find convert all the list with take mod(10)
for (auto i : A) {
if (i <= k) {
B.push_back(i);
}
else {
B.push_back(i % 10);
}
}
After that i tried to define summations that gives k via unorderep_map
unordered_map<int, int> sumList;
int j = k;
for (int i = 0; i < 10; i++) {
sumList[i] = j;
if (j==0) j=9;
j--;
}
But i can't figure out that how can i count the number of pairs in O(n), what can i do now?
Let’s begin with a simple example. Assume that k = 0. That means that we want to find the number of pairs that sum up to a multiple of 10. What would those pairs look like? Well, they could be formed by
adding up a number whose last digit is 1 with a number whose last digit is 9,
adding up a number whose last digit is 2 with a number whose last digit is 8,
adding up a number whose last digit is 3 with a number whose last digit is 7,
adding up a number whose last digit is 4 with a number whose last digit is 6, or
adding up two numbers whose last digit is 5, or
adding up two numbers whose last digit is 0.
So suppose you have a frequency table A where A[i] is the number of numbers with last digit i. Then the number of pairs of numbers whose last digits are i and j, respectively, is given by
A[i] * A[j] if i ≠ j, and
A[i] * A[i-1] / 2 if i = j.
Based on this, if you wanted to count the number of pairs summing to k mod 10, you could
fill in the A array, then
iterate over all possible pairs that sum to k, using the above formula to count up the number of pairs without explicitly listing all of them.
That last step takes time O(1), since there are only ten buckets and iterating over the pairs you need therefore requires at most a constant amount of work.
I’ll leave the rest of the details to you.
Hope this helps!
You can modify counting sort for this.
Below is an untested, unoptimized and only illustrative version:
int mods[10];
void count_mods(int nums[], int n) {
for (int i = 0; i < n; i++)
mods[nums[i]%10]++;
}
int count_pairs(int k) {
// TODO: there's definitely a better way to do this, but it's O(1) anyway..
int count = 0;
for (int i = 0; i < 10; i++)
for (int j = i+1; j < n; j++)
if ((i + j) % 10 == k) {
int pairs = mods[i] > mods[j] ? mods[j] : mods[i];
if (i == j)
pairs /= 2;
count += pairs;
}
return count;
}
EDIT:
With a smaller constant.
int mods[10];
void count_mods(int nums[], int n) {
for (int i = 0; i < n; i++)
mods[nums[i]%10]++;
}
int count_pairs(int k) {
int count = 0;
for (int i = 0; i < 10; i++) {
int j = k - i;
if (j < 0)
j += 10;
count += min(mods[i], mods[j]);
// When k = 2*i we count half (rounded down) the items to make the pairs.
// Thus, we substract the extra elements by rounding up the half.
if (i == j)
count -= (mods[i]+1) / 2;
}
// We counted everything twice.
return count / 2;
}
Given a sequence of n positive integers we need to count consecutive sub-sequences whose sum is divisible by k.
Constraints : N is up to 10^6 and each element up to 10^9 and K is up to 100
EXAMPLE : Let N=5 and K=3 and array be 1 2 3 4 1
Here answer is 4
Explanation : there exists, 4 sub-sequences whose sum is divisible by 3, they are
3
1 2
1 2 3
2 3 4
My Attempt :
long long int count=0;
for(int i=0;i<n;i++){
long long int sum=0;
for(int j=i;j<n;j++)
{
sum=sum+arr[j];
if(sum%k==0)
{
count++;
}
}
}
But obviously its poor approach. Can their be better approach for this question? Please help.
Complete Question: https://www.hackerrank.com/contests/w6/challenges/consecutive-subsequences
Here is a fast O(n + k) solution:
1)Lets compute prefix sums pref[i](for 0 <= i < n).
2)Now we can compute count[i] - the number of prefixes with sum i modulo k(0 <= i < k).
This can be done by iterating over all the prefixes and making count[pref[i] % k]++.
Initially, count[0] = 1(an empty prefix has sum 0) and 0 for i != 0.
3)The answer is sum count[i] * (count[i] - 1) / 2 for all i.
4)It is better to compute prefix sums modulo k to avoid overflow.
Why does it work? Let's take a closer a look at a subarray divisible by k. Let's say that it starts in L position and ends in R position. It is divisible by k if and only if pref[L - 1] == pref[R] (modulo k) because their differnce is zero modulo k(by definition of divisibility). So for each fixed modulo, we can pick any two prefixes with this prefix sum modulo k(and there are exactly count[i] * (count[i] - 1) / 2 ways to do it).
Here is my code:
long long get_count(const vector<int>& vec, int k) {
//Initialize count array.
vector<int> cnt_mod(k, 0);
cnt_mod[0] = 1;
int pref_sum = 0;
//Iterate over the input sequence.
for (int elem : vec) {
pref_sum += elem;
pref_sum %= k;
cnt_mod[pref_sum]++;
}
//Compute the answer.
long long res = 0;
for (int mod = 0; mod < k; mod++)
res += (long long)cnt_mod[mod] * (cnt_mod[mod] - 1) / 2;
return res;
}
That have to make your calculations easier:
//Now we will move all numbers to [0..K-1]
long long int count=0;
for(int i=0;i<n;i++){
arr[i] = arr[i]%K;
}
//Now we will calculate cout of all shortest subsequences.
long long int sum=0;
int first(0);
std::vector<int> beg;
std::vector<int> end;
for(int i=0;i<n;i++){
if (arr[i] == 0)
{
count++;
continue;
}
sum += arr[i];
if (sum == K)
{
beg.push_back(first);
end.push_back(i);
count++;
}
else
{
while (sum > K)
{
sum -= arr[first];
first++;
}
if (sum == K)
{
beg.push_back(first);
end.push_back(i);
count++;
}
}
}
//this way we found all short subsequences. And we need to calculate all subsequences that consist of some short subsequencies.
int party(0);
for (int i = 0; i < beg.size() - 1; ++i)
{
if (end[i] == beg[i+1])
{
count += party + 1;
party++;
}
else
{
party = 0;
}
}
So, with max array size = 10^6 and max size of rest = 99, you will not have overflow even if you will need to summ all numbers in simple int32.
And time you will spend will be around O(n+n)
Please can any one provide with a better algorithm then trying all the combinations for this problem.
Given an array A of N numbers, find the number of distinct pairs (i,
j) such that j >=i and A[i] = A[j].
First line of the input contains number of test cases T. Each test
case has two lines, first line is the number N, followed by a line
consisting of N integers which are the elements of array A.
For each test case print the number of distinct pairs.
Constraints:
1 <= T <= 10
1 <= N <= 10^6
-10^6 <= A[i] <= 10^6 for 0 <= i < N
I think that first sorting the array then finding frequency of every distinct integer and then adding nC2 of all the frequencies plus adding the length of the string at last. But unfortunately it gives wrong ans for some cases which are not known help. here is the implementation.
code:
#include <iostream>
#include<cstdio>
#include<algorithm>
using namespace std;
long fun(long a) //to find the aC2 for given a
{
if (a == 1) return 0;
return (a * (a - 1)) / 2;
}
int main()
{
long t, i, j, n, tmp = 0;
long long count;
long ar[1000000];
cin >> t;
while (t--)
{
cin >> n;
for (i = 0; i < n; i++)
{
cin >> ar[i];
}
count = 0;
sort(ar, ar + n);
for (i = 0; i < n - 1; i++)
{
if (ar[i] == ar[i + 1])
{
tmp++;
}
else
{
count += fun(tmp + 1);
tmp = 0;
}
}
if (tmp != 0)
{
count += fun(tmp + 1);
}
cout << count + n << "\n";
}
return 0;
}
Keep a count of how many times each number appears in an array. Then iterate over the result array and add the triangular number for each.
For example(from the source test case):
Input:
3
1 2 1
count array = {0, 2, 1} // no zeroes, two ones, one two
pairs = triangle(0) + triangle(2) + triangle(1)
pairs = 0 + 3 + 1
pairs = 4
Triangle numbers can be computed by (n * n + n) / 2, and the whole thing is O(n).
Edit:
First, there's no need to sort if you're counting frequency. I see what you did with sorting, but if you just keep a separate array of frequencies, it's easier. It takes more space, but since the elements and array length are both restrained to < 10^6, the max you'll need is an int[10^6]. This easily fits in the 256MB space requirements given in the challenge. (whoops, since elements can go negative, you'll need an array twice that size. still well under the limit, though)
For the n choose 2 part, the part you had wrong is that it's an n+1 choose 2 problem. Since you can pair each one by itself, you have to add one to n. I know you were adding n at the end, but it's not the same. The difference between tri(n) and tri(n+1) is not one, but n.