Perfect Square in Leetcode - c++

I am having trouble understanding the solution to a LeetCode problem.
Given a positive integer n, find the least number of perfect square numbers (for example, 1, 4, 9, 16, ...) which sum to n.
For example, given n = 12, return 3 because 12 = 4 + 4 + 4; given n = 13, return 2 because 13 = 4 + 9.
Solution:
int numSquares(int n) {
    static vector<int> dp {0};
    while (dp.size() <= n) {
        int m = dp.size(), squares = INT_MAX;
        for (int i=1; i*i<=m; ++i)
            squares = min(squares, dp[m-i*i] + 1);
        dp.push_back(squares);
    }
    return dp[n];
}
I really don't understand what is going on with min(squares, dp[m-i*i] + 1). Can you please explain?
Thanks.

I had a hard time with this too. Let's take the example number n=13.
First thing to observe is that: 1^2 =1, 2^2=4, 3^2=9, 4^2=16
So 13 can't be composed of any square greater than 3^2. Generally speaking, n can only be composed of the squares of the numbers 1 to floor(sqrt(n)).
So we are left with some combination of the square of the following numbers: 1, 2, or 3.
Next thing we want to do is come up with the recursive formula. This took me a long time to understand. But we basically want to dwindle down to work with a smaller n (that's the whole point of recursion). We do that by subtracting our candidate perfect squares from n. For example:
If we try 3, then dp(13) = dp(13 - 3^2) + 1 = dp(4) + 1.
The +1 increments the count by 1 and comes from the fact that we already took a perfect square off 13, namely 3^2. Each +1 is a perfect square that we took off.
If we try 2, then dp(13) = dp(13 - 2^2) + 1 = dp(9) + 1.
If we try 1, then dp(13) = dp(13 - 1^2) + 1 = dp(12) + 1.
So we are left with comparing which is the smallest out of dp(4) + 1, dp(9) + 1, and dp(12) + 1. Hence the min.
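To make the n = 13 walkthrough concrete, here is a minimal sketch of the same recurrence as a plain bottom-up table. The name numSquaresTable is an assumption made for illustration and does not come from the question; it mirrors the question's loop but uses a local vector instead of the static one.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical helper: dp[m] is the least number of perfect squares summing to m.
int numSquaresTable(int n) {
    std::vector<int> dp(n + 1, 0);  // dp[0] = 0: zero squares sum to 0
    for (int m = 1; m <= n; ++m) {
        int best = INT_MAX;
        // Try every square i*i that fits into m; the +1 counts that square.
        for (int i = 1; i * i <= m; ++i) {
            best = std::min(best, dp[m - i * i] + 1);
        }
        dp[m] = best;
    }
    return dp[n];
}
```

For n = 13 the inner loop compares dp[12] + 1 = 4, dp[9] + 1 = 2, and dp[4] + 1 = 2, so the minimum is 2.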

The solution you mention is the bottom-up version of the algorithm. To understand the algorithm better, I would advise looking at the top-down version first.
Let's look more closely at the recurrence relation for the minimal number of perfect squares that sum to N. For a given N and an arbitrary number x (a candidate member of the shortest sequence of numbers whose squares sum to N):
f(N, x) = 0 , if N = 0 ;
f(N, x) = min( f(N, x + 1), f(N - x^2, 1) + 1 ) , if N >= x^2 ;
f(N, x) = +infinity , otherwise ;
solution(N) = f(N, 1)
Now, having in mind the considered recurrence, we can construct the top-down solution (I will implement it in Java):
int solve(int n) {
return solve(n, 1);
}
int solve(int n, int curr) {
if (n == 0) {
return 0;
}
if ((curr * curr) > n) {
return POSITIVE_INFINITY;
}
// if curr belongs to the shortest sequence of numbers, whose perfect squares sums-up to N
int inclusive = solve(n - (curr * curr), 1) + 1;
// otherwise:
int exclusive = solve(n, curr + 1);
return Math.min(exclusive, inclusive);
}
The runtime complexity of this solution is exponential.
However, notice that n only takes values in [0..n] and curr only takes values in [1..sqrt(n)]. This implies that there are only about n * sqrt(n) distinct combinations of arguments to the function solve. Hence, we can create a memoization table and reduce the complexity of the top-down solution:
int solve(int n) {
// initialization of the memoization table
int[][] memoized = new int[n + 1][(int) (Math.sqrt(n) + 1)];
for (int[] row : memoized) {
Arrays.fill(row, NOT_INITIALIZED);
}
return solve(n, 1, memoized);
}
int solve(int n, int curr, int[][] memoized) {
if (n == 0) {
return 0;
}
if ((curr * curr) > n) {
return POSITIVE_INFINITY;
}
if (memoized[n][curr] != NOT_INITIALIZED) {
// the sub-problem has been already solved
return memoized[n][curr];
}
int exclusive = solve(n, curr + 1, memoized);
int inclusive = solve(n - (curr * curr), 1, memoized) + 1;
memoized[n][curr] = Math.min(exclusive, inclusive);
return memoized[n][curr];
}
This solution has runtime complexity O(N * sqrt(N)) and needs a memoization table of size O(N * sqrt(N)).
However, it is possible to reduce the size of the memoization table to O(N).
Since the recurrence relation for f(N, x) depends only on f(N, x + 1) and f(N - x^2, 1), it can be equivalently transformed to the loop form:
f(0) = 0
f(N) = min( f(N - x^2) + 1 ) , across all x such that x^2 <= N
In this case we have to memoize f(N) only for N different values of its argument. Each value still loops over up to sqrt(N) candidates, so the runtime stays O(N * sqrt(N)).
Hence, below is the top-down solution with O(N) memory:
int solve_top_down_2(int n) {
int[] memoized = new int[n + 1];
Arrays.fill(memoized, NOT_INITIALIZED);
return solve_top_down_2(n, memoized);
}
int solve_top_down_2(int n, int[] memoized) {
if (n == 0) {
return 0;
}
if (memoized[n] != NOT_INITIALIZED) {
return memoized[n];
}
// if 1 belongs to the shortest sequence of numbers whose perfect squares sum up to N
int result = solve_top_down_2(n - (1 * 1), memoized) + 1;
for (int curr = 2; (curr * curr) <= n; curr++) {
// check whether some other number belongs to the shortest sequence of numbers whose perfect squares sum up to N
result = Math.min(result, solve_top_down_2(n - (curr * curr), memoized) + 1);
}
memoized[n] = result;
return result;
}
Finally, the presented top-down solution can be easily transformed to the bottom-up solution:
int solve_bottom_up(int n) {
int[] memoized = new int[n + 1];
for (int i = 1; i <= n; i++) {
memoized[i] = memoized[i - (1 * 1)] + 1;
for (int curr = 2; (curr * curr) <= i; curr++) {
memoized[i] = Math.min(memoized[i], memoized[i - (curr * curr)] + 1);
}
}
return memoized[n];
}

The answer to your confusion lies in the question itself. The vector dp holds, at each index, the least number of squares that sum to that index.
E.g., for m = 9 the candidate i = 1 gives dp[8] + 1 = 3, but the least possible is 1, which is what dp[m - i*i] + 1 gives for i = 3 (dp[0] + 1). The min keeps the smallest candidate.

Related

Need optimization tips for a subset sum like problem with a big constraint

Given a number 1 <= N <= 3*10^5, count all subsets of the set {1, 2, ..., N-1} that sum up to N. This is essentially a variant of the subset sum problem in which the target sum equals N and the set is simply the consecutive integers 1 to N-1.
I think I have solved this using DP with an ordered map and an inclusion/exclusion recursive algorithm, but due to the time and space complexity I can't compute more than 10000 elements.
#include <iostream>
#include <chrono>
#include <map>
#include "bigint.h"
using namespace std;
//2d hashmap to store values from recursion; keys- i & sum; value- count
map<pair<int, int>, bigint> hmap;
bigint counter(int n, int i, int sum){
//end case
if(i == 0){
if(sum == 0){
return 1;
}
return 0;
}
//alternative end case if its sum is zero before it has finished iterating through all of the possible combinations
if(sum == 0){
return 1;
}
//case if the result of the recursion is already in the hashmap
if(hmap.find(make_pair(i, sum)) != hmap.end()){
return hmap[make_pair(i, sum)];
}
//only proceed further recursion if resulting sum wouldnt be negative
if(sum - i < 0){
//optimization that skips unnecessary recursive branches
return hmap[make_pair(i, sum)] = counter(n, sum, sum);
}
else{
//include the number dont include the number
return hmap[make_pair(i, sum)] = counter(n, i - 1, sum - i) + counter(n, i - 1, sum);
}
}
The function is called with the starting values N, N-1, and N: the number of elements, the iterator (which decrements), and the remaining sum of the recursive branch (which decreases with every included value).
This is the code that calculates the number of subsets. For an input of 3000 it takes around 22 seconds to output the result, which is 40 digits long. Because of the long results I had to use an arbitrary-precision library, bigint from rgroshanrg, which works fine for values less than ~10000. Testing beyond that gives me a segfault on line 28-29, maybe due to the stored arbitrary-precision values becoming too big and conflicting in the map. I need to somehow improve this code so it can work with values beyond 10000, but I am stumped. Any ideas, or should I switch to another algorithm and data storage?
Here is a different algorithm, described in a paper by Evangelos Georgiadis, "Computing Partition Numbers q(n)":
std::vector<BigInt> RestrictedPartitionNumbers(int n)
{
std::vector<BigInt> q(n, 0);
// initialize q with A010815
for (int i = 0; ; i++)
{
int n0 = i * (3 * i - 1) >> 1;
if (n0 >= q.size())
break;
q[n0] = 1 - 2 * (i & 1);
int n1 = i * (3 * i + 1) >> 1;
if (n1 < q.size())
q[n1] = 1 - 2 * (i & 1);
}
// construct A000009 as per "Evangelos Georgiadis, Computing Partition Numbers q(n)"
for (size_t k = 0; k < q.size(); k++)
{
size_t j = 1;
size_t m = k + 1;
while (m < q.size())
{
if ((j & 1) != 0)
q[m] += q[k] << 1;
else
q[m] -= q[k] << 1;
j++;
m = k + j * j;
}
}
return q;
}
It's not the fastest algorithm out there, and it took about half a minute on my computer for n = 300000. But you only need to do it once (since it computes all partition numbers up to some bound) and it doesn't take a lot of memory (a bit over 150MB).
The results go up to but excluding n, and they assume that each number is allowed to be a partition of itself, e.g. the set {4} is a partition of the number 4. Your definition of the problem excludes that case, so you need to subtract 1 from the result.
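As a small illustration of the "subtract 1" step, here is a sketch of the same recurrence with std::int64_t standing in for BigInt. That substitution is an assumption made only to keep the example self-contained, so it is valid just for small n (a few hundred) before the values overflow. The names distinctPartitions and countSubsets are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: q[m] = number of partitions of m into distinct parts,
// for 0 <= m < limit (same recurrence as above, fixed-width integers).
std::vector<std::int64_t> distinctPartitions(int limit) {
    std::vector<std::int64_t> q(limit, 0);
    // Seed with the pentagonal-number coefficients (A010815).
    for (std::int64_t i = 0;; ++i) {
        std::int64_t n0 = i * (3 * i - 1) / 2;
        if (n0 >= limit) break;
        std::int64_t sign = (i % 2 == 0) ? 1 : -1;
        q[n0] = sign;
        std::int64_t n1 = i * (3 * i + 1) / 2;
        if (n1 < limit) q[n1] = sign;
    }
    // Push 2 * q[k] forward to k + j*j with alternating sign.
    for (std::int64_t k = 0; k < limit; ++k) {
        for (std::int64_t j = 1; k + j * j < limit; ++j) {
            if (j % 2 == 1) q[k + j * j] += 2 * q[k];
            else            q[k + j * j] -= 2 * q[k];
        }
    }
    return q;
}

// Subsets of {1, ..., n-1} summing to n: drop the single-part partition {n}.
std::int64_t countSubsets(int n) {
    return distinctPartitions(n + 1)[n] - 1;
}
```

For n = 5 the distinct partitions are {5}, {4, 1}, and {3, 2}, so countSubsets(5) is 2. Note that the table is built with limit n + 1 because the results stop just before the limit.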
Maybe there's a nicer way to express A010815, that part of the code isn't slow though, I just think it looks bad.

What is the time complexity of using max heap to solve "Find the K-th largest number in the array" problem?

"Find the K-th largest number in the array" problem:
inputs: [3,2,1,5,6,4], k = 2
outputs: 5
inputs: [3,2,3,1,2,4,5,5,6], k = 4
outputs: 4
I know that this problem can be solved with quickselect and min-heap algorithms. However, what I'm focusing on here is using a max heap. The steps are the following:
1. Build a max heap from the whole given array, in place.
2. Iterate k times. In each iteration, take and remove the heap top.
3. The last heap top taken at the k-th iteration is the result.
Below is the code in cpp:
void swap(vector<int>& nums, int i, int j) {
int tmp = nums[i];
nums[i] = nums[j];
nums[j] = tmp;
}
void heapify_down(vector<int>& nums, int parent_idx, int end_idx) {
int left_idx = parent_idx * 2 + 1, right_idx = left_idx + 1;
while (left_idx <= end_idx) {
int largeset_idx = parent_idx;
if (nums[left_idx] > nums[largeset_idx]) largeset_idx = left_idx;
if (right_idx <= end_idx && nums[right_idx] > nums[largeset_idx]) largeset_idx = right_idx;
if (largeset_idx != parent_idx) {
swap(nums, parent_idx, largeset_idx);
parent_idx = largeset_idx;
left_idx = parent_idx * 2 + 1;
right_idx = left_idx + 1;
}
else {
return ;
}
}
}
void build_heap(vector<int>& nums) {
for (int i = nums.size() - 1; i >= 0; i--) heapify_down(nums, i, nums.size() - 1);
}
int findKthLargest(vector<int>& nums, int k) {
build_heap(nums);
int cnt = 0, res, cur_end = nums.size() - 1;
while (cnt != k) {
res = nums[0];
cnt += 1;
swap(nums, 0, cur_end);
cur_end -= 1;
heapify_down(nums, 0, cur_end);
}
return res;
}
What is the time complexity of this method? I'm using the bottom-up approach in the first step, so this step should be O(n). However, the while loop confuses me. The loop executes k times, and each iteration calls heapify_down, which is O(log(n)). So is the overall complexity O(n + k * log(n)) = O(max(n, k * log(n)))? Correct me if I'm wrong.
Building a heap, if you do it right, is O(n), and removing the top is O(log n). With k unbounded, it could be as large as n.
So overall it is O(n + k * log n), which in the worst case k = n is O(n + n * log n) == O(n log n).
Don't you have some more information, like k < log n or something?
Your build_heap is already the O(n) bottom-up construction, because it calls heapify_down from the last index toward the root; it is building the heap by inserting elements one at a time and sifting each one up that results in O(n * log n). As a small saving you can start at i = size / 2 - 1 instead of size - 1, since every later index is a leaf (with 0-based indexing the children of i are 2 * i + 1 and 2 * i + 2) and heapify_down does nothing there.
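For comparison, the same O(n + k * log n) procedure can be sketched with the standard library heap algorithms. This is not the asker's code: the function name kthLargestWithMaxHeap is hypothetical, the vector is taken by value so the caller's data is untouched, and it assumes 1 <= k <= nums.size().

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper; assumes 1 <= k <= nums.size().
int kthLargestWithMaxHeap(std::vector<int> nums, int k) {
    // Linear-time heap construction: O(n).
    std::make_heap(nums.begin(), nums.end());
    auto heapEnd = nums.end();
    // Remove the k - 1 largest elements, O(log n) each.
    for (int removed = 0; removed < k - 1; ++removed) {
        std::pop_heap(nums.begin(), heapEnd);  // moves the max to heapEnd - 1
        --heapEnd;                             // shrink the heap range
    }
    // The current top is the k-th largest.
    return nums.front();
}
```

std::make_heap is specified to use a linear number of comparisons and std::pop_heap a logarithmic number, which matches the O(n) + k * O(log n) breakdown discussed here.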

Speed problem for summation (sum of divisors)

I need to implement this summation in C++. I have tried with this code, but for very large inputs, up to 10^12, it takes too long.
The summation is: S(n) = d(1) + d(2) + ... + d(n).
For any positive integer k, let d(k) denote the number of positive divisors of k (including 1 and k itself).
For example, for n = 4: 1 has one divisor, 2 has two divisors, 3 has two divisors, and 4 has three divisors. So the result would be 8.
This is my code:
#include <iostream>
#include <algorithm>
using namespace std;
int findDivisors(long long n)
{
int c=0;
for(long long j=1;j*j<=n;j++)
{
if(n%j==0)
{
c++;
if(j!=(n/j))
{
c++;
}
}
}
return c;
}
long long compute(long long n)
{
long long sum=0;
for(long long i=1; i<=n; i++)
{
sum += (findDivisors(i));
}
return sum;
}
int main()
{
long long n; // int cannot hold inputs up to 10^12
freopen("input.txt", "r", stdin);
freopen("output.txt", "w", stdout);
cin >> n;
cout << compute(n);
}
I think it's not just a simple optimization problem, but maybe I should change the algorithm entirely.
Would anyone have any ideas to speed it up? Thank you.
largest_prime_is_463035818's answer shows an O(N) solution, but the OP is trying to solve this problem
with very high numbers up to 10^12.
The following is an O(N^(1/2)) algorithm, based on some observations about the sum
n/1 + n/2 + n/3 + ... + n/n
In particular, we can count the number of terms with a specific value.
Consider all the terms n/k where k > n/2. There are about n/2 of those and all are equal to 1 (integer division), so that their sum is about n/2.
Similar considerations hold for the other quotients, so that we can write the following function
long long count_divisors(long long n)
{
auto sum{ n };
for (auto i{ 1ll }, k_old{ n }, k{ n }; i < k ; ++i, k_old = k)
{ // ^^^^^ it goes up to sqrt(n)
k = n / (i + 1);
sum += (k_old - k) * i;
if (i == k)
break;
sum += k;
}
return sum;
}
Here it is tested against the O(N) algorithm, the only differences in the results being the corner cases n = 0 and n = 1.
Edit
Thanks again to largest_prime_is_463035818, who linked the Wikipedia page about the divisor summatory function, where both an O(N) and an O(sqrt(N)) algorithm are mentioned.
An implementation of the latter may look like this
auto divisor_summatory(long long n)
{
auto sum{ 0ll };
auto k{ 1ll };
for ( ; k <= n / k; ++k )
{
sum += n / k;
}
--k;
return 2 * sum - k * k;
}
They also add this statement:
Finding a closed form for this summed expression seems to be beyond the techniques available, but it is possible to give approximations. The leading behavior of the series is given by
D(x) = x log x + x(2γ - 1) + Δ(x)
where γ is the Euler–Mascheroni constant, and the error term is Δ(x) = O(sqrt(x)).
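To see how close the leading terms get, here is a rough sketch that puts the exact O(sqrt(N)) sum next to the approximation x log x + x(2γ - 1). The function names are hypothetical, and the value used for γ is the usual double-precision approximation of the Euler–Mascheroni constant.

```cpp
#include <cmath>

// Exact divisor summatory function, same idea as divisor_summatory above.
long long divisorSummatoryExact(long long n) {
    long long sum = 0;
    long long k = 1;
    for (; k <= n / k; ++k) {
        sum += n / k;
    }
    --k;  // k is now floor(sqrt(n))
    return 2 * sum - k * k;
}

// Leading terms only; the error term Δ(x) is left out.
double divisorSummatoryApprox(double x) {
    const double eulerGamma = 0.5772156649015329;  // assumed constant value
    return x * std::log(x) + x * (2.0 * eulerGamma - 1.0);
}
```

For n = 100 the exact value is 482 and the approximation is about 476, a difference that is small compared with sqrt(100) = 10 scaled by a modest constant, in line with the stated error term.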
I used your brute force approach as reference to have test cases. The ones I used are
compute(12) == 35
compute(100) == 482
Don't get confused by computing factorizations. There are some tricks one can play when factorizing numbers, but you actually don't need any of that. The solution is a plain simple O(N) loop:
#include <iostream>
#include <limits>
long long compute(long long n){
long long sum = n+1;
for (long long i=2; i < n ; ++i){
sum += n/i;
}
return sum;
}
int main()
{
std::cout << compute(12) << "\n";
std::cout << compute(100) << "\n";
}
Output:
35
482
Why does this work?
The key is in Marc Glisse's comment:
As often with this kind of problem, this sum actually counts pairs x,
y where x divides y, and the sum is arranged to count first all x
corresponding to a fixed y, but nothing says you have to keep it that
way.
I could stop here, because the comment already explains it all. Though, if it didn't click yet...
The trick is to realize that it is much simpler to count divisors of all numbers up to n rather than n-times counting divisors of individual numbers and take the sum.
You don't need to care about factorizations of eg 123123123 or 52323423 to count all divisors up to 10000000000. All you need is a change of perspective. Instead of trying to factorize numbers, consider the divisors. How often does the divisor 1 appear up to n? Simple: n-times. How often does the divisor 2 appear? Still simple: n/2 times, because every second number is divisible by 2. Divisor 3? Every 3rd number is divisible by 3. I hope you can see the pattern already.
You could even reduce the loop to only loop till n/2, because bigger numbers obviously appear only once as divisor. Though I didn't bother to go further, because the biggest change is from your O(N * sqrt(N)) to O(N).
Let's start off with some math and reduce the O(n * sqrt(n)) factorization to O(n * log(log(n))); for counting the sum of divisors the overall complexity is O(n * log(log(n)) + n * n^(1/3)).
For instance:
In Codeforces himanshujaju explains how we can optimize the solution of finding divisors of a number.
I am simplifying it a little bit.
Let n be the product of three numbers p, q, and r,
so assume p * q * r = n, where p <= q <= r.
The maximum value of p = n^(1/3).
Now we can loop over all prime numbers in a range [2, n^(1/3)]
and try to reduce the time complexity of prime factorization.
We will split our number n into two numbers x and y => x * y = n.
And x contains prime factors up to n^(1/3) and y deals with higher prime factors greater than n^(1/3).
Thus gcd(x, y) = 1.
Now define F(n) as the number of divisors of n.
From multiplicative rules, we can say that
F(x * y) = F(x) * F(y), if gcd(x, y) = 1.
For finding F(n) => F(x * y) = F(x) * F(y)
So first find F(x); then F(y) will be F(n/x).
And there will be 3 cases to cover for y:
1. y is a prime number: F(y) = 2.
2. y is the square of a prime number: F(y) = 3.
3. y is a product of two distinct prime numbers: F(y) = 4.
So once we are done with finding F(x) and F(y), we are also done with finding F(x * y) or F(n).
In Cp-Algorithm there is also a nice explanation of how to count the number of divisors on a number. And also in GeeksForGeeks a nice coding example of how to count the number of divisors of a number in an efficient way. One can check the articles and can generate a nice solution to this problem.
C++ implementation
#include <bits/stdc++.h>
using namespace std;
const int maxn = 1e6 + 11;
bool prime[maxn];
bool primesquare[maxn];
int table[maxn]; // for storing primes
void SieveOfEratosthenes()
{
for(int i = 2; i < maxn; i++){
prime[i] = true;
}
for(int i = 0; i < maxn; i++){
primesquare[i] = false;
}
// 1 is not a prime number
prime[1] = false;
for(int p = 2; p * p < maxn; p++){
// If prime[p] is not changed, then
// it is a prime
if(prime[p] == true){
// Update all multiples of p
for(int i = p * 2; i < maxn; i += p){
prime[i] = false;
}
}
}
int j = 0;
for(int p = 2; p < maxn; p++) {
if (prime[p]) {
// Storing primes in an array
table[j] = p;
// Update value in primesquare[p * p],
// if p is prime.
if(p < maxn / p) primesquare[p * p] = true;
j++;
}
}
}
// Function to count divisors
int countDivisors(int n)
{
// If number is 1, then it will have only 1
// as a factor. So, total factors will be 1.
if (n == 1)
return 1;
// ans will contain total number of distinct
// divisors
int ans = 1;
// Loop for counting factors of n
for(int i = 0;; i++){
// table[i] is not less than cube root n
if(table[i] * table[i] * table[i] > n)
break;
// Calculating power of table[i] in n.
int cnt = 1; // cnt is power of prime table[i] in n.
while (n % table[i] == 0){ // if table[i] is a factor of n
n = n / table[i];
cnt = cnt + 1; // incrementing power
}
// Calculating the number of divisors
// If n = a^p * b^q then total divisors of n
// are (p+1)*(q+1)
ans = ans * cnt;
}
// if table[i] is greater than cube root of n
// First case
if (prime[n])
ans = ans * 2;
// Second case
else if (primesquare[n])
ans = ans * 3;
// Third case
else if (n != 1)
ans = ans * 4;
return ans; // Total divisors
}
int main()
{
SieveOfEratosthenes();
int sum = 0;
int n = 5;
for(int i = 1; i <= n; i++){
sum += countDivisors(i);
}
cout << sum << endl;
return 0;
}
Output
n = 4 => 8
n = 5 => 10
Complexity
Time complexity: O(n * log(log(n)) + n * n^(1/3))
Space complexity: O(n)
Thanks, @largest_prime_is_463035818, for pointing out my mistake.

String decode: looking for a better approach

I have worked out a O(n square) solution to the problem. I was wondering about a better solution to this. (this is not a homework/interview problem but something I do out of my own interest, hence sharing here):
If a=1, b=2, c=3,….z=26. Given a string, find all possible codes that string
can generate. example: "1123" shall give:
aabc //a = 1, a = 1, b = 2, c = 3
kbc // since k is 11, b = 2, c= 3
alc // a = 1, l = 12, c = 3
aaw // a= 1, a =1, w= 23
kw // k = 11, w = 23
Here is my code to the problem:
void alpha(int* a, int sz, vector<vector<int>>& strings) {
for (int i = sz - 1; i >= 0; i--) {
if (i == sz - 1) {
vector<int> t;
t.push_back(a[i]);
strings.push_back(t);
} else {
int k = strings.size();
for (int j = 0; j < k; j++) {
vector<int> t = strings[j];
strings[j].insert(strings[j].begin(), a[i]);
if (t[0] < 10) {
int n = a[i] * 10 + t[0];
if (n <= 26) {
t[0] = n;
strings.push_back(t);
}
}
}
}
}
}
Essentially the vector strings will hold the sets of numbers.
This would run in n squared. I am trying to get my head around at least an n log n solution.
Intuitively a tree should help here, but I am not getting anywhere beyond that.
Generally, your problem complexity is more like 2^n, not n^2, since your k can increase with every iteration.
This is an alternative recursive solution (note: recursion is bad for very long codes). I didn't focus on optimization, since I'm not up to date with C++X, but I think the recursive solution could be optimized with some moves.
Recursion also makes the complexity a bit more obvious compared to the iterative solution.
// Add the front element to each trailing code sequence. Create a new sequence if none exists
void update_helper(int front, std::vector<std::deque<int>>& intermediate)
{
if (intermediate.empty())
{
intermediate.push_back(std::deque<int>());
}
for (size_t i = 0; i < intermediate.size(); i++)
{
intermediate[i].push_front(front);
}
}
std::vector<std::deque<int>> decode(int digits[], int count)
{
if (count <= 0)
{
return std::vector<std::deque<int>>();
}
std::vector<std::deque<int>> result1 = decode(digits + 1, count - 1);
update_helper(*digits, result1);
if (count > 1 && (digits[0] * 10 + digits[1]) <= 26)
{
std::vector<std::deque<int>> result2 = decode(digits + 2, count - 2);
update_helper(digits[0] * 10 + digits[1], result2);
result1.insert(result1.end(), result2.begin(), result2.end());
}
return result1;
}
Call:
std::vector<std::deque<int>> strings = decode(codes, size);
Edit:
Regarding the complexity of the original code, I'll try to show what would happen in the worst case scenario, where the code sequence consists only of 1 and 2 values.
void alpha(int* a, int sz, vector<vector<int>>& strings)
{
for (int i = sz - 1;
i >= 0;
i--)
{
if (i == sz - 1)
{
vector<int> t;
t.push_back(a[i]);
strings.push_back(t); // strings.size+1
} // if summary: O(1), ignoring capacity change, strings.size+1
else
{
int k = strings.size();
for (int j = 0; j < k; j++)
{
vector<int> t = strings[j]; // O(strings[j].size) vector copy operation
strings[j].insert(strings[j].begin(), a[i]); // strings[j].size+1
// note: strings[j].insert treated as O(1) because other containers could do better than vector
if (t[0] < 10)
{
int n = a[i] * 10 + t[0];
if (n <= 26)
{
t[0] = n;
strings.push_back(t); // strings.size+1
// O(1), ignoring capacity change and copy operation
} // if summary: O(1), strings.size+1
} // if summary: O(1), ignoring capacity change, strings.size+1
} // for summary: O(k * strings[j].size), strings.size+k, strings[j].size+1
} // else summary: O(k * strings[j].size), strings.size+k, strings[j].size+1
} // for summary: O(sum[i from 1 to sz] of (k * strings[j].size))
// k (same as string.size) doubles each iteration => k ends near 2^sz
// string[j].size increases by 1 each iteration
// k * strings[j].size increases by ?? each iteration (its getting huge)
}
Maybe I made a mistake somewhere and if we want to play nice we can treat a vector copy as O(1) instead of O(n) in order to reduce complexity, but the hard fact remains, that the worst case is doubling outer vector size in each iteration (at least every 2nd iteration, considering the exact structure of the if conditions) of the inner loop and the inner loop depends on that growing vector size, which makes the whole story at least O(2^n).
Edit2:
I figured out the result complexity (the best hypothetical algorithm still needs to create every element of the result, so the result complexity is a lower bound on what any algorithm can achieve).
It actually follows the Fibonacci numbers:
For worst case input (like only 1s) of size N+2 you have:
size N has k(N) elements
size N+1 has k(N+1) elements
size N+2 is the combination of codes starting with a followed by the combinations from size N+1 (a takes one element of the source) and the codes starting with k, followed by the combinations from size N (k takes two elements of the source)
size N+2 has k(N) + k(N+1) elements
Starting with size 1 => 1 (a) and size 2 => 2 (aa or k)
Result: still exponential growth ;)
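The Fibonacci-style recurrence above can be checked without building any of the decodings. The following is a minimal sketch that only counts them; the function name countDecodings and the choice of a std::string of digit characters as input are assumptions made for illustration, and it uses the question's mapping a = 1 ... z = 26.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: number of ways to split `digits` into codes 1..26.
std::uint64_t countDecodings(const std::string& digits) {
    std::uint64_t waysPrev2 = 1;  // ways for the prefix two characters back
    std::uint64_t waysPrev1 = 1;  // ways for the previous prefix (empty prefix: 1)
    for (std::size_t i = 0; i < digits.size(); ++i) {
        std::uint64_t current = 0;
        int one = digits[i] - '0';
        if (one >= 1) {
            current += waysPrev1;      // last code is a single digit
        }
        if (i >= 1) {
            int two = (digits[i - 1] - '0') * 10 + one;
            if (two >= 10 && two <= 26) {
                current += waysPrev2;  // last code uses two digits
            }
        }
        waysPrev2 = waysPrev1;
        waysPrev1 = current;
    }
    return waysPrev1;
}
```

Counting takes O(n) time even though listing every decoding cannot, since the list itself grows like the Fibonacci numbers. For "1123" the count is 5, matching the five strings in the question.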
Edit3:
Worked out a dynamic programming solution, somewhat similar to your approach, with reverse iteration over the code array and somewhat optimized in its vector usage, based on the properties explained in Edit2.
The inner loop (update_helper) is still dominated by the count of results (worst case Fibonacci) and a few outer loop iterations will have a decent count of sub-results, but at least the sub-results are reduced to a pointer to some intermediate node, so copying should be pretty efficient. As a little bonus, I switched the result from numbers to characters.
Another edit: updated code with range 0 - 25 as 'a' - 'z', fixed some errors that led to wrong results.
struct const_node
{
const_node(char content, const_node* next)
: next(next), content(content)
{
}
const_node* const next;
const char content;
};
// put front in front of each existing sub-result
void update_helper(int front, std::vector<const_node*>& intermediate)
{
for (size_t i = 0; i < intermediate.size(); i++)
{
intermediate[i] = new const_node(front + 'a', intermediate[i]);
}
if (intermediate.empty())
{
intermediate.push_back(new const_node(front + 'a', NULL));
}
}
std::vector<const_node*> decode_it(int digits[9], size_t count)
{
int current = 0;
std::vector<const_node*> intermediates[3];
for (size_t i = 0; i < count; i++)
{
current = (current + 1) % 3;
int prev = (current + 2) % 3; // -1
int prevprev = (current + 1) % 3; // -2
size_t index = count - i - 1; // invert direction
// copy from prev
intermediates[current] = intermediates[prev];
// update current (part 1)
update_helper(digits[index], intermediates[current]);
if (index + 1 < count && digits[index] &&
digits[index] * 10 + digits[index + 1] < 26)
{
// update prevprev
update_helper(digits[index] * 10 + digits[index + 1], intermediates[prevprev]);
// add to current (part 2)
intermediates[current].insert(intermediates[current].end(), intermediates[prevprev].begin(), intermediates[prevprev].end());
}
}
return intermediates[current];
}
void cleanupDelete(std::vector<const_node*>& nodes);
int main()
{
int code[] = { 1, 2, 3, 1, 2, 3, 1, 2, 3 };
int size = sizeof(code) / sizeof(int);
std::vector<const_node*> result = decode_it(code, size);
// output
for (size_t i = 0; i < result.size(); i++)
{
std::cout.width(3);
std::cout.flags(std::ios::right);
std::cout << i << ": ";
const_node* item = result[i];
while (item)
{
std::cout << item->content;
item = item->next;
}
std::cout << std::endl;
}
cleanupDelete(result);
}
void fillCleanup(const_node* n, std::set<const_node*>& all_nodes)
{
if (n)
{
all_nodes.insert(n);
fillCleanup(n->next, all_nodes);
}
}
void cleanupDelete(std::vector<const_node*>& nodes)
{
// this is like multiple inverse trees, hard to delete correctly, since multiple next pointers refer to the same target
std::set<const_node*> all_nodes;
for (auto var : nodes) // standard range-based for instead of the non-standard "for each"
{
fillCleanup(var, all_nodes);
}
nodes.clear();
for (auto var : all_nodes)
{
delete var;
}
all_nodes.clear();
}
A drawback of the dynamically reused structure is the cleanup, since you have to be careful to delete each node only once.

count distinct slices in an array

I was trying to solve this problem.
An integer M and a non-empty zero-indexed array A consisting of N
non-negative integers are given. All integers in array A are less than
or equal to M.
A pair of integers (P, Q), such that 0 ≤ P ≤ Q < N, is called a slice
of array A. The slice consists of the elements A[P], A[P + 1], ...,
A[Q]. A distinct slice is a slice consisting of only unique numbers.
That is, no individual number occurs more than once in the slice.
For example, consider integer M = 6 and array A such that:
A[0] = 3
A[1] = 4
A[2] = 5
A[3] = 5
A[4] = 2
There are exactly nine distinct slices: (0, 0), (0, 1), (0, 2), (1,
1), (1,2), (2, 2), (3, 3), (3, 4) and (4, 4).
The goal is to calculate the number of distinct slices.
Thanks in advance.
#include <algorithm>
#include <vector>
#include <cstring>
#include <cmath>
#define MAX 100002
// you can write to stdout for debugging purposes, e.g.
// cout << "this is a debug message" << endl;
using namespace std;
bool check[MAX];
int solution(int M, vector<int> &A) {
memset(check, false, sizeof(check));
int base = 0;
int fibot = 0;
int sum = 0;
while(fibot < A.size()){
if(check[A[fibot]]){
base = fibot;
}
check[A[fibot]] = true;
sum += fibot - base + 1;
fibot += 1;
}
return min(sum, 1000000000);
}
The solution is not correct because your algorithm is wrong.
First of all, let me show you a counterexample. Let A = {2, 1, 2}. The first iteration: base = 0, fibot = 0, sum += 1. That's right. The second one: base = 0, fibot = 1, sum += 2. That's correct, too. The last step: fibot = 2, check[A[fibot]] is true, thus base = 2. But it should be 1. So your code returns 1 + 2 + 1 = 4 while the right answer is 1 + 2 + 2 = 5.
The right way to do it could be like this: start with L = 0. For each R from 0 to n - 1, keep moving L to the right until the subarray contains only distinct values (you can maintain the number of occurrences of each value in an array and use the fact that A[R] is the only element that can occur more than once).
There is one more issue with your code: the sum variable may overflow if int is 32-bit type on the testing platform (for instance, if all elements of A are distinct).
As for the question WHY your algorithm is incorrect, I have no idea why it should be correct in the first place. Can you prove it? The base = fibot assignment looks quite arbitrary to me.
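The two-pointer approach described above could be sketched as follows. This is only one possible reading of that description: the function name countDistinctSlices is hypothetical, the cap of 1,000,000,000 is taken from the return statement in the question's code, and a 64-bit accumulator is used to avoid the overflow mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper; all values in A are assumed to lie in [0, M].
int countDistinctSlices(int M, const std::vector<int>& A) {
    const long long LIMIT = 1000000000;  // cap taken from the question's code
    std::vector<int> occurrences(M + 1, 0);
    long long total = 0;
    std::size_t left = 0;
    for (std::size_t right = 0; right < A.size(); ++right) {
        ++occurrences[A[right]];
        // A[right] is the only value that can now occur twice;
        // move left forward until it is unique again.
        while (occurrences[A[right]] > 1) {
            --occurrences[A[left]];
            ++left;
        }
        // Every slice (P, right) with left <= P <= right is distinct.
        total += static_cast<long long>(right - left + 1);
        if (total >= LIMIT) {
            return static_cast<int>(LIMIT);
        }
    }
    return static_cast<int>(total);
}
```

On the counterexample A = {2, 1, 2} this adds 1 + 2 + 2 = 5, and on the example from the question it gives 9. Each index is visited at most twice (once by right, once by left), so the running time is O(N).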
I would like to share the explanation of the algorithm that I have implemented in C++ followed by the actual implementation.
Notice that the minimum amount of distinct slices is N because each element is a distinct one-item slice.
Start the back index from the first element.
Start the front index from the first element.
Advance the front until we find a duplicate in the sequence.
In each iteration, increment the counter with the necessary amount, this is the difference between front and back.
If we reach the maximum counts at any iteration, just return immediately for slight optimisation.
In each iteration of the sequence, record the elements that have occurred.
Once we have found a duplicate, advance the back index one ahead of the duplicate.
While we advance the back index, clear all the occurred elements since we start a new slice beyond those elements.
The runtime complexity of this solution is O(N) since we go through each element.
The space complexity of this solution is O(M) because we have a hash to store the occurred elements in the sequences. The maximum element of this hash is M.
int solution(int M, vector<int> &A)
{
    int N = A.size();
    int distinct_slices = N;
    vector<bool> seq_hash(M + 1, false);
    for (int back = 0, front = 0; front < N; ++back) {
        while (front < N and !seq_hash[A[front]]) {
            distinct_slices += front - back;
            if (distinct_slices > 1000000000) return 1000000000;
            seq_hash[A[front++]] = true;
        }
        while (front < N and back < N and A[back] != A[front]) seq_hash[A[back++]] = false;
        seq_hash[A[back]] = false;
    }
    return distinct_slices;
}
100% python solution that helped me, thanks to https://www.martinkysel.com/codility-countdistinctslices-solution/
def solution(M, A):
    the_sum = 0
    front = back = 0
    seen = [False] * (M+1)
    while (front < len(A) and back < len(A)):
        while (front < len(A) and seen[A[front]] != True):
            the_sum += (front-back+1)
            seen[A[front]] = True
            front += 1
        else:
            while front < len(A) and back < len(A) and A[back] != A[front]:
                seen[A[back]] = False
                back += 1
            seen[A[back]] = False
            back += 1
    return min(the_sum, 1000000000)
Solution with 100% using Ruby
LIMIT = 1_000_000_000
def solution(_m, a)
  a.each_with_index.inject([0, {}]) do |(result, slice), (back, i)|
    return LIMIT if result >= LIMIT
    slice[back] = true
    a[(i + slice.size)..-1].each do |front|
      break if slice[front]
      slice[front] = true
    end
    slice.delete back
    [result + slice.size, slice]
  end.first + a.size
end
Using the caterpillar algorithm and the formula S(n+1) = S(n) + n + 1, where S(n) is the count of slices for an n-element array, a Java solution could be:
public int solution(int top, int[] numbers) {
int len = numbers.length;
long count = 0;
if (len == 1) return 1;
int front = 0;
int[] counter = new int[top + 1];
for (int i = 0; i < len; i++) {
while(front < len && counter[numbers[front]] == 0 ) {
count += front - i + 1;
counter[numbers[front++]] = 1;
}
while(front < len && numbers[i] != numbers[front] && i < front) {
counter[numbers[i++]] = 0;
}
counter[numbers[i]] = 0;
if (count > 1_000_000_000) {
return 1_000_000_000;
}
}
return (int) count; // count is a long; the cast is safe because it was capped above
}