Unable to understand algorithm to find maximum sum of Subarray - c++

I am looking at the algorithm used to obtain the maximum sum of a subarray within an array and am unable to understand the logic behind the code. Specifically, I don't understand what this line is doing: max_ending = max(0, max_ending + number). Also, would this algorithm have a complexity of O(n) or O(n^2)?
#include <vector>
#include <algorithm>
using namespace std;
template <typename T> T max_sub_array (vector<T> const & numbers)
{
    T max_ending = 0, max_so_far = 0; // comma, so that both variables are declared as T
    for(auto & number: numbers)
    {
        max_ending = max(0, max_ending + number);
        max_so_far = max(max_so_far, max_ending);
    }
    return max_so_far;
}
Thank You

The algorithm you presented seems to be the one attributed (in Wikipedia) to Jay Kadane. The line, max_ending = max(0, max_ending + number), means we are looking only at non-negative sums; in other words, if adding one more element to the current subarray would result in a negative sum, then make the maximum sum ending at this index zero (i.e., the subarray is empty again). The line relies on the idea that the only time we need to reset the examined subarray window is if it drops below zero — even if large positive elements could be added later, a larger subarray sum would be attained without a drop to negative in the middle. Let's look at an example (max_ending means the maximum sum for a subarray ending at the current index):
array       {1,  2, 23, -4,  3, -10}
max_ending   1   3  26  22  25   15
max_so_far   1   3  26  26  26   26
Time complexity for this algorithm is assessed as O(n) since each array element needs to be visited once, and the number of iterations is dependent on the array size in a linear fashion. O(n^2) would mean that for each array element, the number of iterations would be on the order of the array size; so as the array size increases, the number of iterations would increase quadratically.
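To check the table above by machine, here is a minimal sketch that runs the same loop and records both variables after every element. The struct KadaneTrace and the function trace_max_sub_array are illustrative names, not part of the question's code, and the sketch is fixed to int for simplicity.

```cpp
#include <algorithm>
#include <vector>

// Values of the two loop variables after each element has been processed.
// (Illustrative helper type, not from the original post.)
struct KadaneTrace {
    std::vector<int> max_ending;
    std::vector<int> max_so_far;
};

KadaneTrace trace_max_sub_array(const std::vector<int>& numbers) {
    KadaneTrace trace;
    int max_ending = 0, max_so_far = 0;
    for (int number : numbers) {
        // Reset to the empty subarray once the running sum would go negative.
        max_ending = std::max(0, max_ending + number);
        max_so_far = std::max(max_so_far, max_ending);
        trace.max_ending.push_back(max_ending);
        trace.max_so_far.push_back(max_so_far);
    }
    return trace;
}
```

For {1, 2, 23, -4, 3, -10} this reproduces the two rows of the table. An input such as {2, -5, 4} shows the reset: max_ending goes 2, 0, 4. The single pass over the input is what makes the running time linear.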

This is taken from exercise 4.1-5 of the Introduction to Algorithms Book.
Since that exercise and its answer address exactly this, I think it may be helpful.
"Use the following ideas to develop a nonrecursive, linear-time algorithm for the maximum-subarray problem. Start at the left end of the array, and progress toward the right, keeping track of the maximum subarray seen so far. Knowing a maximum subarray of A[1..j], extend the answer to find a maximum subarray ending at index j + 1 by using the following observation: a maximum subarray of A[1..j + 1] is either a maximum subarray of A[1..j] or a subarray A[i..j + 1], for some 1 <= i <= j + 1. Determine a maximum subarray of the form A[i..j + 1] in constant time based on knowing a maximum subarray ending at index j."
Answer
First we need to figure out the maximum subarray ending at index j + 1, which is either just A[j + 1] or the maximum subarray ending at j plus A[j + 1]; therefore we take the max of these two.
Once we have the maximum subarray ending at index j + 1, we find the maximum subarray of A[1..j + 1] by again taking the max, this time between the maximum subarray ending at index j + 1 and the maximum subarray of A[1..j].
So basically the idea is to compute, in each iteration, the maximum subarray ending at the current index and to take the max between this and the best result of the previous iteration.
Also, I think this line is incorrect:
max_ending = max(0, max_ending + number);
It should be:
max_ending = max(number, max_ending + number);
Edit: This depends on the definition. If the array is guaranteed to contain at least one positive number (or an empty subarray with sum 0 is an acceptable answer), yours is OK. If the array may contain only negative numbers and the subarray must be non-empty, the change is needed.
Also, max_ending and max_so_far should start at numbers[0] and the loop at index 1, if you follow my change.
The complexity for this algorithm is O(n)
Additional note:
In your version there is no need to get the max between number and max_ending + number because max_ending >= 0 so number <= number + max_ending
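As a minimal sketch of the variant discussed in this answer (start both variables at numbers[0], loop from index 1, compare against number instead of 0), assuming the subarray must be non-empty. The function name max_sub_array_nonempty is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Maximum sum over all non-empty contiguous subarrays.
// For an all-negative input this returns the largest element, not 0.
template <typename T>
T max_sub_array_nonempty(const std::vector<T>& numbers) {
    assert(!numbers.empty());  // assumption: the caller passes at least one element
    T max_ending = numbers[0];
    T max_so_far = numbers[0];
    for (std::size_t i = 1; i < numbers.size(); ++i) {
        // Either extend the best subarray ending at i - 1 or start fresh at i.
        max_ending = std::max(numbers[i], max_ending + numbers[i]);
        max_so_far = std::max(max_so_far, max_ending);
    }
    return max_so_far;
}
```

On {-3, -1, -2} this returns -1, whereas the version that clamps at 0 returns 0 (the empty subarray). On inputs with at least one positive element the two versions agree.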

https://app.codility.com/programmers/lessons/9-maximum_slice_problem/max_slice_sum/
The coding test at the link above helps in understanding whether to use 0 or something else as the floor for max_ending. As גלעד ברקן said, the important decision point is whether to discard the slice so far.

Let's say sum[i] is the largest sum of a sequence ending at numbers[i]; then sum[i] = max(numbers[i], numbers[i] + sum[i-1]), and the answer is max(sum[i]) for i from 0 to numbers.size() - 1.
See Maximum subarray problem

Related

minimum total move to balance array if we can increase/decrease a specific array element by 1

It is LeetCode 462.
I have an algorithm, but it fails some tests while passing others.
I tried to think it through, but I am not sure which corner case I overlooked.
We have one array of N elements. One move is defined as increasing OR decreasing one single element of the array by 1. We are trying to find the minimum number of moves to make all elements equal.
My idea is:
1. find the average
2. find the element closest to the average
3. sum together the difference between each element and the element closest to the average.
What am I missing? Please provide a counterexample.
class Solution {
public:
    int minMoves2(vector<int>& nums) {
        int sum=0;
        for(int i=0;i<nums.size();i++){
            sum += nums[i];
        }
        double avg = (double) sum / nums.size();
        int min = nums[0];
        int index =0 ;
        for(int i=0;i<nums.size();i++){
            if(abs(nums[i]-avg) <= abs(min - avg)){
                min = nums[i];
                index = i;
            }
        }
        sum=0;
        for(int i=0;i<nums.size();i++){
            sum += abs(min - nums[i]);
        }
        return sum;
    }
};
Suppose the array is [1, 1, 10, 20, 100]. The average is 26.4, and the element closest to it is 20. So your solution would involve 19 + 19 + 10 + 0 + 80 = 128 moves. What if we target 10 instead? Then we have 9 + 9 + 0 + 10 + 90 = 118 moves. So this is a counterexample.
Suppose you decide to target changing all array elements to some value T. The question is, what's the right value for T? Given some value of T, we could ask if increasing or decreasing T by 1 will improve or worsen our outcome. If we decrease T by 1, then all values greater than or equal to T need an extra move, and all those below need one move less. That means that if T is above the median, there are more values below it than above, and so we benefit from decreasing T. We can make the opposite argument if T is less than the median. From this we can conclude that the correct value of T is actually the median itself, which my example demonstrates (strictly speaking, when you have an even-sized array, T can be anywhere between the two middle elements).
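The median argument can be sketched in a few lines. This is one possible implementation, not the asker's code: the function name min_moves_to_equal is hypothetical, and std::nth_element is used only because it finds a median without a full sort.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Minimum number of +1/-1 moves needed to make all elements equal,
// obtained by moving every element to a median.
long long min_moves_to_equal(std::vector<int> nums) {  // taken by value: it gets reordered
    if (nums.empty()) return 0;
    auto mid = nums.begin() + nums.size() / 2;
    std::nth_element(nums.begin(), mid, nums.end());  // *mid is now a median
    const long long target = *mid;
    long long moves = 0;
    for (int x : nums) moves += std::llabs(target - x);
    return moves;
}
```

For [1, 1, 10, 20, 100] the median is 10 and the result is 118, matching the counterexample. For an even-sized array the upper of the two middle elements is used, which gives the same total as any other target between them.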

[Competitive Programming]:How do I optimise this brute force method? [duplicate]

If n numbers are given, how would I find the total number of possible triangles? Is there any method that does this in less than O(n^3) time?
I am considering a+b>c, b+c>a and a+c>b conditions for being a triangle.
Assume there are no equal numbers among the given n and that it is allowed to use one number more than once. For example, given the numbers {1,2,3}, we can create 7 triangles:
1 1 1
1 2 2
1 3 3
2 2 2
2 2 3
2 3 3
3 3 3
If any of those assumptions isn't true, it's easy to modify the algorithm.
Here I present an algorithm which takes O(n^2) time in the worst case:
1. Sort the numbers (ascending order).
2. We will take triples ai <= aj <= ak, such that i <= j <= k. For each i, j you need to find the largest k that satisfies ak < ai + aj. Then every triple (ai, aj, al) with j <= l <= k is a triangle (because al >= aj >= ai, the only inequality that can be violated is al < ai + aj).
3. Consider two pairs (i, j1) and (i, j2) with j1 <= j2. It's easy to see that k2 (found in step 2 for (i, j2)) >= k1 (found in step 2 for (i, j1)). It means that when you iterate over j, you only need to check numbers starting from the previous k. So it gives you O(n) time complexity for each particular i, which implies O(n^2) for the whole algorithm.
C++ source code:
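#include <algorithm>

int Solve(int* a, int n)
{
    int answer = 0;
    std::sort(a, a + n);
    for (int i = 0; i < n; ++i)
    {
        int k = i; // k is not reset for each j, it only moves forward
        for (int j = i; j < n; ++j)
        {
            // advance k to the first index where a[k] >= a[i] + a[j]
            while (n > k && a[i] + a[j] > a[k])
                ++k;
            answer += k - j; // third sides a[j..k-1]
        }
    }
    return answer;
}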
Update for downvoters:
This definitely is O(n^2)! Please read carefully the chapter about Amortized Analysis in "Introduction to Algorithms" by Thomas H. Cormen et al. (17.2 in the second edition).
Finding complexity by counting nested loops is sometimes completely wrong.
Here I try to explain it as simply as I can. Let's fix the variable i. Then for that i we must iterate j from i to n (an O(n) operation), and the internal while loop iterates k from i to n (also an O(n) operation in total). Note: I don't start the while loop from the beginning for each j. We also need to do it for each i from 0 to n. So it gives us n * (O(n) + O(n)) = O(n^2).
There is a simple algorithm in O(n^2*logn).
Assume you want all triangles as triples (a, b, c) where a <= b <= c.
There are 3 triangle inequalities but only a + b > c suffices (others then hold trivially).
And now:
Sort the sequence in O(n * logn), e.g. by merge-sort.
For each pair (a, b), a <= b the remaining value c needs to be at least b and less than a + b.
So you need to count the number of items in the interval [b, a+b).
This can be done simply by binary-searching for a+b (O(logn)): the number of items in [b, a+b) is the difference between the position found and the position of b.
All together O(n * logn + n^2 * logn) which is O(n^2 * logn). Hope this helps.
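A minimal sketch of this sort-plus-binary-search idea follows. It assumes each array element may be used at most once per triangle (index triples i < j < k), which is one reading of the question; the function name is made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Counts index triples i < j < k of the sorted input with a[i] + a[j] > a[k].
// Assumption: side lengths are positive and each element is used at most once.
long long count_triangles_binary_search(std::vector<int> a) {
    std::sort(a.begin(), a.end());
    long long count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            // Candidates for the third side start right after j.
            auto first = a.begin() + j + 1;
            // First candidate that is too long, i.e. >= a[i] + a[j].
            auto too_long = std::lower_bound(
                first, a.end(), static_cast<long long>(a[i]) + a[j]);
            count += too_long - first;
        }
    }
    return count;
}
```

For {4, 6, 3, 7} the valid triples are (3, 4, 6), (3, 6, 7) and (4, 6, 7), so the function returns 3. Each of the O(n^2) pairs costs one O(log n) search, giving the stated O(n^2 * logn).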
If you use a binary sort, that's O(n log(n)), right? Keep your binary tree handy, and for each pair (a,b) where a < b, look up the values c with c > b and c < (a+b).
Let a, b and c be three sides. The below condition must hold for a triangle (Sum of two sides is greater than the third side)
i) a + b > c
ii) b + c > a
iii) a + c > b
Following are steps to count triangle.
1. Sort the array in non-decreasing order.
2. Initialize two pointers 'i' and 'j' to the first and second elements respectively, and initialize the count of triangles as 0.
3. Fix 'i' and 'j' and find the rightmost index 'k' (or largest 'arr[k]') such that 'arr[i] + arr[j] > arr[k]'. The number of triangles that can be formed with 'arr[i]' and 'arr[j]' as two sides is 'k - j'. Add 'k - j' to the count of triangles.
Let us consider 'arr[i]' as 'a', 'arr[j]' as 'b' and all elements between 'arr[j+1]' and 'arr[k]' as 'c'. The above-mentioned conditions (ii) and (iii) are satisfied because 'arr[i] < arr[j] < arr[k]', and we check condition (i) when we pick 'k'.
4. Increment 'j' to fix the second element again.
Note that in step 3, we can use the previous value of 'k'. The reason is simple: if we know that the value of 'arr[i] + arr[j-1]' is greater than 'arr[k]', then we can say 'arr[i] + arr[j]' will also be greater than 'arr[k]', because the array is sorted in increasing order.
5. If 'j' has reached the end, then increment 'i'. Initialize 'j' as 'i + 1', 'k' as 'i + 2' and repeat steps 3 and 4.
Time Complexity: O(n^2).
The time complexity looks higher because of the 3 nested loops. If we take a closer look at the algorithm, we observe that k is initialized only once per iteration of the outermost loop. The innermost loop executes at most O(n) times in total for every iteration of the outermost loop, because k starts from i+2 and goes up to n across all values of j. Therefore, the time complexity is O(n^2).
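The steps above can be sketched as follows. This is an illustrative version under the same assumption as the steps (each element used at most once), with one deliberate difference: here k stops one position past the rightmost valid third side, so the count added per pair is k - j - 1 rather than k - j. The function name is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Two-pointer count of index triples i < j < k with arr[i] + arr[j] > arr[k].
long long count_triangles_two_pointers(std::vector<int> arr) {
    std::sort(arr.begin(), arr.end());
    const std::size_t n = arr.size();
    long long count = 0;
    for (std::size_t i = 0; i + 2 < n; ++i) {
        std::size_t k = i + 2;  // set once per i; it only moves right afterwards
        for (std::size_t j = i + 1; j + 1 < n; ++j) {
            if (k <= j) k = j + 1;  // the third side must come after j
            // Advance k to the first index that is too long for (i, j).
            while (k < n && static_cast<long long>(arr[i]) + arr[j] > arr[k]) ++k;
            count += k - j - 1;  // valid third sides are arr[j+1 .. k-1]
        }
    }
    return count;
}
```

Because k never moves left while i is fixed, the while loop does at most n steps in total per value of i, which is where the O(n^2) bound comes from.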
I have worked out an algorithm that runs in O(n^2 lgn) time. I think it's correct...
The code is written in C++...
#include <algorithm>

/* Returns the largest index k in A[p..q] with A[k] < n, or p - 1 if there is none */
int Search_Closest(const int A[], int p, int q, int n)
{
    if (p > q)
        return p - 1;
    int r = p + (q - p) / 2;
    if (A[r] < n)
        return Search_Closest(A, r + 1, q, n);
    return Search_Closest(A, p, r - 1, n);
}
int no_of_triangles(int A[], int p, int q) /* Returns the no of triangles possible in A[p..q] */
{
    int sum = 0;
    std::sort(A + p, A + q + 1); // Sorts the array A[p..q] in O(n lg n) time
    for (int i = p; i <= q; i++)
        for (int j = i + 1; j <= q; j++)
        {
            int c = A[i] + A[j];
            int k = Search_Closest(A, j + 1, q, c);
            /* the third side can be any of A[j+1..k], so the no of triangles
               formed with A[i] and A[j] as two sides is k - j */
            sum += k - j;
        }
    return sum;
}
Hope it helps........
Possible answer:
We can also use binary search to find the value of 'k' and hence improve the time complexity!
Given N0, N1, N2, ..., Nn-1,
sort them into
X0, X1, X2, ..., Xn-1 so that X0 >= X1 >= X2 >= ... >= Xn-1.
Choose X0 (up to Xn-3) as the largest side and choose the other two items from the rest, X1, ...
For the choice (X0, X1, X2),
check X0 < X1 + X2:
if OK, a triangle is found, continue;
if not (NG), skip the remaining choices.
If every triangle has to be listed rather than just counted, it seems there is no algorithm better than O(n^3). In the worst case, the result set itself has O(n^3) elements.
For Example, if n equal numbers are given, the algorithm has to return n*(n-1)*(n-2) results.

Number of Increasing Subsequences of length k

I am trying to understand the algorithm that gives me the number of increasing subsequences of length K in an array in time O(nklog(n)). I know how to solve this very same problem using the O(k*n^2) algorithm. I have looked up and found out this solution uses BIT (Fenwick Tree) and DP. I have also found some code, but I have not been able to understand it.
Here are some links I've visited that have been helpful.
Here in SO
Topcoder forum
Random webpage
I would really appreciate it if someone could help me understand this algorithm.
I am reproducing my algorithm from here, where its logic is explained:
dp[i, j] = same as before
num[i] = how many subsequences that end with i (element, not index this time)
         have a certain length

for i = 1 to n do
    dp[i, 1] = 1

for p = 2 to k do // for each length this time
    num = {0}
    for i = 2 to n do
        // note: dp[1, p > 1] = 0
        // how many that end with the previous element
        // have length p - 1
        num[ array[i - 1] ] += dp[i - 1, p - 1]   *1*
        // append the current element to all those smaller than it
        // that end an increasing subsequence of length p - 1,
        // creating an increasing subsequence of length p
        for j = 1 to array[i] - 1 do              *2*
            dp[i, p] += num[j]
You can optimize *1* and *2* by using segment trees or binary indexed trees. These will be used to efficiently process the following operations on the num array:
Given (x, v) add v to num[x] (relevant for *1*);
Given x, find the sum num[1] + num[2] + ... + num[x] (relevant for *2*).
These are trivial problems for both data structures.
Note: This will have complexity O(n*k*log S), where S is the upper bound on the values in your array. This may or may not be good enough. To make it O(n*k*log n), you need to normalize the values of your array prior to running the above algorithm. Normalization means converting all of your array values into values lower than or equal to n. So this:
5235 223 1000 40 40
Becomes:
4 2 3 1 1
This can be accomplished with a sort (keep the original indexes).
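Putting the pieces together, here is a minimal sketch of the normalized O(n*k*log n) version with a binary indexed tree standing in for num. The class Fenwick and the function count_increasing_subsequences are illustrative names; the sketch assumes strictly increasing subsequences, returns 0 for k = 0, and lets counts wrap modulo 2^64 instead of guarding against overflow.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Minimal binary indexed tree over positions 1..n (point add, prefix sum).
class Fenwick {
public:
    explicit Fenwick(std::size_t n) : tree_(n + 1, 0) {}
    void add(std::size_t pos, std::uint64_t value) {
        for (; pos < tree_.size(); pos += pos & (~pos + 1)) tree_[pos] += value;
    }
    std::uint64_t prefix_sum(std::size_t pos) const {  // num[1] + ... + num[pos]
        std::uint64_t sum = 0;
        for (; pos > 0; pos -= pos & (~pos + 1)) sum += tree_[pos];
        return sum;
    }
private:
    std::vector<std::uint64_t> tree_;
};

std::uint64_t count_increasing_subsequences(const std::vector<int>& values,
                                            std::size_t k) {
    const std::size_t n = values.size();
    if (k == 0 || k > n) return 0;

    // Normalization: map each value to its 1-based rank among distinct values.
    std::vector<int> sorted(values);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
    std::vector<std::size_t> rank(n);
    for (std::size_t i = 0; i < n; ++i)
        rank[i] = std::lower_bound(sorted.begin(), sorted.end(), values[i])
                  - sorted.begin() + 1;

    // dp[i] = number of increasing subsequences of the current length ending at i.
    std::vector<std::uint64_t> dp(n, 1);  // length 1
    for (std::size_t len = 2; len <= k; ++len) {
        Fenwick num(sorted.size());  // reset for each length, like num = {0}
        std::vector<std::uint64_t> next(n, 0);
        for (std::size_t i = 0; i < n; ++i) {
            next[i] = num.prefix_sum(rank[i] - 1);  // *2*: strictly smaller values only
            num.add(rank[i], dp[i]);                // *1*: expose i to later elements
        }
        dp = std::move(next);
    }
    std::uint64_t total = 0;
    for (std::uint64_t c : dp) total += c;
    return total;
}
```

For {1, 2, 3, 4} and k = 2 this returns 6. For the normalization example 5235 223 1000 40 40 and k = 2 it returns 1, the pair (223, 1000), because the two equal 40s do not form a strictly increasing pair.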

Given an array of N numbers,find the number of sequences of all lengths having the range of R?

This is a follow up question to Given a sequence of N numbers ,extract number of sequences of length K having range less than R?
I basically need a vector v as an answer of size N such that V[i] denotes number of sequences of length i which have range <=R.
Traditionally, in recursive solutions, you would compute the solution for K = 0, K = 1, and then find some kind of recurrence relation between subsequent elements to avoid recomputing the solution from scratch each time.
However here I believe that maybe attacking the problem from the other side would be interesting, because of the property of the spread:
Given a sequence of spread R (or less), any subsequence has a spread inferior to R as well
Therefore, I would first establish a list of the longest subsequences of spread R beginning at each index. Let's call this list M, and have M[i] = j where j is the highest index in S (the original sequence) for which S[j] - S[i] <= R. This is going to be O(N).
Now, for any i, the number of sequences of length K starting at i is either 0 or 1, and this depends whether K is greater than M[i] - i or not. A simple linear pass over M (from 0 to N-K) gives us the answer. This is once again O(N).
So, if we call V the resulting vector, with V[k] denoting the number of subsequences of length K in S with spread inferior to R, then we can do it in a single iteration over M:
for i in [0, len(M)]:
    for k in [0, M[i] - i]:
        ++V[k]
The algorithm is simple; however, the number of updates can be rather daunting. In the worst case, supposing that M[i] - i equals N - i, it has O(N*N) complexity. You would need a better data structure (probably an adaptation of a Fenwick Tree) to use this algorithm and lower the cost of computing those numbers.
If you are looking for contiguous sequences, try doing it recursively : The K-length subsequences set having a range inferior than R are included in the (K-1)-length subsequences set.
At K=0, you have N solutions.
Each time you increase K, you append (resp. prepend) the next (resp. previous) element, check whether the range is less than R, and either store it in a set (look out for duplicates!) or discard it depending on the result.
I think the complexity of this algorithm is O(n*n) in the worst-case scenario, though it may be better on average.
I think Matthieu has the right answer when looking for all sequences with spread R.
As you are only looking for sequences of length K, you can do a little better.
Instead of looking at the maximum sequence starting at i, just look at the sequence of length K starting at i, and see if it has range R or not. Do this for every i, and you have all sequences of length K with spread R.
You don't need to go through the whole list, as the latest start point for a sequence of length K is n-K+1. So the complexity is something like (n-K+1)*K = n*K - K*K + K. For K=1 this is n, and for K=n it is n. For K=n/2 it is n*n/2 - n*n/4 + n/2 = n*n/4 + n/2, which I think is the maximum. So while this is still O(n*n), for most values of K you get a little better.
Start with a simpler problem: find the maximal length of the sequences that start at each index and have a range less than or equal to R.
To do this, let the first pointer point to the first element of the array. Increase the second pointer (also starting from the first element of the array) while the sequence between the pointers has a range less than or equal to R. Push every array element passed by the second pointer to a min-max-queue, made of a pair of min-max-stacks, described in this answer. When the difference between the max and min values reported by the min-max-queue exceeds R, stop increasing the second pointer, increment V[ptr2-ptr1], increment the first pointer (removing the element it pointed to from the min-max-queue), and continue increasing the second pointer (keeping the range under control).
When the second pointer leaves the bounds of the array, increment V[N-ptr1] for all remaining ptr1 (the corresponding ranges may be less than or equal to R). To add all other ranges that are less than R, compute the cumulative sum of the array V[], starting from its end.
Both time and space complexities are O(N).
Pseudo-code:
p1 = p2 = 0;
do {
    do {
        min_max_queue.push(a[p2]);
        ++p2;
    } while (p2 < N && min_max_queue.range() <= R);
    if (p2 < N) {
        ++v[p2 - p1 - 1];
        min_max_queue.pop();
        ++p1;
    }
} while (p2 < N);
for (i = 1; i <= N-p1; ++i) {
    ++v[i];
}
sum = 0;
for (j = N; j > 0; --j) {
    value = v[j];
    v[j] += sum;
    sum += value;
}
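For comparison, here is a compilable sketch of the same two-pointer idea. It is a variation rather than a transcription of the pseudo-code: it anchors each window at its right end instead of its left end, and it uses two monotonic deques in place of the min-max-queue from the linked answer. The function name count_by_length is made up, and the sketch assumes contiguous sequences with range defined as max - min.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Returns v where v[len] is the number of contiguous subarrays of length len
// whose range (max - min) is <= R. v[0] is unused.
std::vector<long long> count_by_length(const std::vector<int>& a, long long R) {
    const std::size_t n = a.size();
    std::vector<long long> v(n + 1, 0);
    if (R < 0) return v;  // no window can have a negative range

    std::deque<std::size_t> max_idx;  // indices whose values decrease from the front
    std::deque<std::size_t> min_idx;  // indices whose values increase from the front
    std::size_t left = 0;
    for (std::size_t right = 0; right < n; ++right) {
        while (!max_idx.empty() && a[max_idx.back()] <= a[right]) max_idx.pop_back();
        max_idx.push_back(right);
        while (!min_idx.empty() && a[min_idx.back()] >= a[right]) min_idx.pop_back();
        min_idx.push_back(right);

        // Shrink from the left until the window [left, right] has range <= R.
        while (static_cast<long long>(a[max_idx.front()]) - a[min_idx.front()] > R) {
            ++left;
            if (max_idx.front() < left) max_idx.pop_front();
            if (min_idx.front() < left) min_idx.pop_front();
        }
        ++v[right - left + 1];  // longest valid window ending at right
    }
    // Every shorter window with the same right end is valid too: suffix sums.
    for (std::size_t len = n; len > 1; --len) v[len - 1] += v[len];
    return v;
}
```

For a = {1, 3, 2, 5} and R = 2 the result is {0, 4, 2, 1, 0}: four windows of length 1, two of length 2 ([1,3] and [3,2]), one of length 3 ([1,3,2]) and none of length 4. Each index enters and leaves each deque at most once, so the running time stays O(N).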

Array: mathematical sequence

An array of integers A[i] (i > 1) is defined in the following way: an element A[k] ( k > 1) is the smallest number greater than A[k-1] such that the sum of its digits is equal to the sum of the digits of the number 4* A[k-1] .
You need to write a program that calculates the N th number in this array based on the given first element A[1] .
INPUT:
In one line of standard input there are two numbers separated by a single space: A[1] (1 <= A[1] <= 100) and N (1 <= N <= 10000).
OUTPUT:
The standard output should only contain a single integer A[N] , the Nth number of the defined sequence.
Input:
7 4
Output:
79
Explanation:
Elements of the array are as follows: 7, 19, 49, 79... and the 4th element is solution.
I tried solving this by coding a separate function that, for a given number A[k], calculates the sum of its digits and finds the smallest number greater than A[k-1] as described in the problem, but with no success. The first test failed because of a memory limit, the second test failed because of a time limit, and now I don't have any idea how to solve this. One friend suggested recursion, but I don't know how to set that up.
If anyone can help me in any way, please write; ideas about using recursion/DP for solving this problem are also welcome. Thanks.
This has nothing to do with recursion and almost nothing with dynamic programming. You just need to find viable optimizations to make it fast enough. Just a hint, try to understand this solution:
http://codepad.org/LkTJEILz
Here is a simple solution in Python. It only uses iteration; recursion is unnecessary and inefficient even for a quick and dirty solution.
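def sumDigits(x):
    total = 0
    while x > 0:
        total += x % 10
        x //= 10  # integer division; plain / would produce a float in Python 3
    return total

def homework(a0, N):
    a = [a0]
    while len(a) < N:
        nextNum = a[-1] + 1
        # advance until the digit sum matches that of 4 * the previous element
        while sumDigits(nextNum) != sumDigits(4 * a[-1]):
            nextNum += 1
        a.append(nextNum)
    return a[N-1]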
PS. I know we're not really supposed to give homework answers, but it appears the OP is in an intro to C++ class so probably doesn't know python yet, hopefully it just looks like pseudo code. Also the code is missing many simple optimizations which would probably make it too slow for a solution as is.
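Since the question is about C++, here is a rough C++ rendering of the same brute-force idea. It is a sketch, not a tuned solution: the names digit_sum and sequence_element are made up, only the latest element is kept instead of the whole list, the inner search is still linear, and overflow of 4 * current is not guarded against.

```cpp
#include <cassert>
#include <cstdint>

// Sum of the decimal digits of x.
std::uint64_t digit_sum(std::uint64_t x) {
    std::uint64_t sum = 0;
    while (x > 0) {
        sum += x % 10;
        x /= 10;
    }
    return sum;
}

// Returns A[n] (1-based) for the sequence that starts with A[1] = first.
std::uint64_t sequence_element(std::uint64_t first, unsigned n) {
    assert(n >= 1);  // assumption: indices start at 1, as in the problem statement
    std::uint64_t current = first;
    for (unsigned step = 1; step < n; ++step) {
        const std::uint64_t target = digit_sum(4 * current);
        std::uint64_t next = current + 1;
        // Linear scan for the smallest larger number with the target digit sum.
        while (digit_sum(next) != target) ++next;
        current = next;
    }
    return current;
}
```

With the sample input, sequence_element(7, 4) walks 7, 19, 49, 79 and returns 79. Keeping a single running value avoids storing the sequence, but the scan per step is the part that would need the optimizations mentioned above to meet a time limit.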
It is rather recursive.
The kernel of the problem is:
Find the smallest number N greater than K having digitsum(N) = J.
If digitsum(K) == J then test if N = K + 9 satisfies the condition.
If digitsum(K) < J then possibly N differs from K only in the ones digit (if the digitsum can be achieved without exceeding 9).
Otherwise if digitsum(K) <= J the new ones digit is 9 and the problem recurses to "Find the smallest number N' greater than (K/10) having digitsum(N') = J-9, then N = N'*10 + 9".
If digitsum(K) > J then ???
In every case N <= 4 * K
9 -> 18 by the first rule
52 -> 55 by the second rule
99 -> 189 by the third rule, the first rule is used during recursion
25 -> 100 requires the fourth case, which I had originally not seen the need for.
Any more counterexamples?