What's the efficient way to sum up the elements of an array in the following way? - c++

Suppose you are given an array A of size n and an integer k.
Now you have to follow this function:
long long sum(int k)
{
    long long sum = 0;
    for (int i = 0; i < n; i++) {
        sum += min(A[i], k);
    }
    return sum;
}
What is the most efficient way to find sum?
EDIT: if I am given m (<= 100000) queries, each with a different k, it becomes very time consuming.

If the data set changes with each query (each k), then you can't do better than O(n). Your only options for optimizing are to use multiple threads (each thread sums some region of the array) or at least to ensure that your loop is properly vectorized by the compiler (or to write a vectorized version manually using intrinsics).
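For the single-pass O(n) case, one possible way to hand the parallelization/vectorization to the library is C++17's parallel transform_reduce. A sketch (the function name is mine; with libstdc++ the parallel policy may require linking TBB):

#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

long long sum_clamped(const std::vector<int>& A, int k) {
    // one pass: clamp each element to k, then add everything up, possibly in parallel
    return std::transform_reduce(std::execution::par_unseq, A.begin(), A.end(),
                                 0LL, std::plus<>{},
                                 [k](int x) { return static_cast<long long>(std::min(x, k)); });
}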
But if the data set is fixed and only k changes from query to query, then you can answer each query in O(log n) by using the following optimization.
Preprocess the array. This is done only once, for all values of k:
Sort elements
Make another array of the same length which contains partial sums
For example:
inputArray: 5 1 3 8 7
sortedArray: 1 3 5 7 8
partialSums: 1 4 9 16 24
Now, when a new k is given, you need to perform the following steps:
Binary-search sortedArray for the given k -- this returns i, the index of the maximal element <= k (if no element is <= k, the answer is simply n * k)
The result is partialSums[i] + (partialSums.length - i - 1) * k: the elements up to and including index i contribute their own values, and the remaining elements are each clamped to k
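A minimal sketch of this scheme (the helper names are mine; upper_bound yields the count of elements <= k directly, which avoids the index bookkeeping):

#include <algorithm>
#include <cstddef>
#include <vector>

std::vector<long long> sortedA;  // sorted copy of the input
std::vector<long long> prefix;   // prefix[c] = sum of the c smallest elements

void preprocess(const std::vector<int>& A) {            // done once, O(n log n)
    sortedA.assign(A.begin(), A.end());
    std::sort(sortedA.begin(), sortedA.end());
    prefix.assign(sortedA.size() + 1, 0);
    for (std::size_t i = 0; i < sortedA.size(); ++i)
        prefix[i + 1] = prefix[i] + sortedA[i];
}

long long sum(int k) {                                  // each query is O(log n)
    std::size_t cnt = std::upper_bound(sortedA.begin(), sortedA.end(), k) - sortedA.begin();
    // elements <= k keep their own values, the remaining n - cnt are clamped to k
    return prefix[cnt] + static_cast<long long>(sortedA.size() - cnt) * k;
}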

You can do way better than that if you can sort the array A and have a secondary array prepared once.
The idea is:
Count how many items are greater than or equal to k; each of those contributes exactly k, so their part of the sum is count*k
Prepare a helper array of prefix sums which gives you the sum of the items strictly below k directly
Preparation
Step 1: sort the array
std::sort(begin(A), end(A));
Step 2: prepare a helper array of prefix sums
std::vector<long long> p_sums(A.size() + 1, 0);
std::partial_sum(begin(A), end(A), begin(p_sums) + 1); // p_sums[i] = sum of the i smallest elements (partial_sum accumulates in A's value type; use a plain loop if that can overflow int)
Query
long long query(int k) {
    // count the items whose value is strictly below k; they keep their own values
    auto it = std::lower_bound(begin(A), end(A), k);
    auto index = std::distance(begin(A), it);
    // the remaining A.size() - index items are clamped to k
    long long result = p_sums[index] + static_cast<long long>(A.size() - index) * k;
    return result;
}
The complexity of the query is: O(log(N)) where N is the length of the array A.
The complexity of the preparation is: O(N*log(N)). We could go down to O(N) with a radix sort but I don't think it is useful in your case.
References
std::sort()
std::partial_sum()
std::lower_bound()

What you do seems absolutely fine, unless this is really time critical: that is, customers complain that your app is too slow, you measured it, and this function is the problem. In that case you can try some non-portable vector instructions, for example.
Often you can do things more efficiently by looking at them from a higher level. For example, if I write
for (n = 0; n < 1000000; ++n)
    printf ("%lld\n", sum (100));
then this will take an awfully long time (half a trillion additions) and can be done a lot quicker. The same applies if you change one element of the array A at a time and recalculate sum each time.
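Two small illustrations of that point (names are mine, not from the question). The printf loop only needs sum(100) computed once:

    long long s = sum(100);
    for (int n = 0; n < 1000000; ++n)
        printf("%lld\n", s);

And if a single element A[i] changes from old_value to new_value, the total for a fixed k can be patched in O(1) instead of re-running sum(k):

#include <algorithm>

long long patched_sum(long long current, int old_value, int new_value, int k) {
    // remove the old clamped contribution, add the new one
    return current - std::min(old_value, k) + std::min(new_value, k);
}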

Suppose there are x elements of array A which are no larger than k, and let B be the set of those elements which are larger than k.
Then the result of the function sum(k) equals
sum_a + k * |B|
, where sum_a is the sum of the elements no larger than k (each of them contributes its own value, while every element of B is clamped to k).
You can first sort the array A and calculate the prefix-sum array pre_A, where
pre_A[i] = pre_A[i - 1] + A[i - 1] (i > 0),
or 0 (i = 0);
Then for each query k, use binary search on A to find the number of elements which are no larger than k; call it cnt. Then sum(k) equals
pre_A[cnt] + k * (n - cnt)
. The time complexity of each query is O(log n).
In case the array A may be changed dynamically, you can use a BST to handle it.

Related

Efficient algorithm to produce closest triplet from 3 arrays?

I need to implement an algorithm in C++ that, when given three arrays of unequal sizes, produces triplets a,b,c (one element contributed by each array) such that max(a,b,c) - min(a,b,c) is minimized. The algorithm should produce a list of these triplets, in order of size of max(a,b,c)-min(a,b,c). The arrays are sorted.
I've implemented the following algorithm (note that I now use arrays of type double), however it runs excruciatingly slow (even when compiled using GCC with -O3 optimization, and other combinations of optimizations). The dataset (and, therefore, each array) has potentially tens of millions of elements. Is there a faster/more efficient method? A significant speed increase is necessary to accomplish the required task in a reasonable time frame.
void findClosest(vector<double> vec1, vector<double> vec2, vector<double> vec3){
    //calculate size of each array
    int len1 = vec1.size();
    int len2 = vec2.size();
    int len3 = vec3.size();
    int i = 0; int j = 0; int k = 0; int res_i, res_j, res_k;
    int diff = INT_MAX;
    int iter = 0; int iter_bound = min(min(len1,len2),len3);
    while(iter < iter_bound){
        while(i < len1 && j < len2 && k < len3){
            int minimum = min(min(vec1[i], vec2[j]), vec3[k]);
            int maximum = max(max(vec1[i], vec2[j]), vec3[k]);
            //if new difference less than previous difference, update difference, store
            //resultants
            if(fabs(maximum - minimum) < diff){ diff = maximum-minimum; res_i = i; res_j = j; res_k = k;}
            //increment minimum value
            if(vec1[i] == minimum) ++i;
            else if(vec2[j] == minimum) ++j;
            else ++k;
        }
        //"remove" triplet
        vec1.erase(vec1.begin() + res_i);
        vec2.erase(vec2.begin() + res_j);
        vec3.erase(vec3.begin() + res_k);
        --len1; --len2; --len3;
        ++iter_bound;
    }
}
OK, you're going to need to be clever in a few ways to make this run well.
The first thing that you need is a priority queue, which is usually implemented with a heap. With that, the algorithm in pseudocode is:
Make a priority queue for possible triples in order of max - min, then how close the median is to their average.
Make a pass through all 3 arrays, putting reasonable triples for every element into the priority queue.
While the priority queue is not empty:
    Pull a triple out
    If all three elements of the triple are unused:
        Add the triple to the output
        Mark the triple used
    else:
        If you can construct reasonable triples for the unused elements:
            Add them to the queue
Now for this operation to succeed, you need to efficiently find elements that are currently unused. Doing that at first is easy, just keep an array of bools where you mark off the indexes of the used values. But once a lot have been taken off, your search gets long.
The trick for that is to have a vector of bools for individual elements, a second for whether both in a pair have been used, a third for whether all 4 in a quadruple have been used, and so on. When you use an element, just mark the individual bool, then go up the hierarchy, marking off the next level if the one you're paired with is marked off, else stopping. This additional data structure of size 2n will require marking an average of 2 bools per element used, but allows you to find the next unused index in either direction in at most O(log(n)) steps.
The resulting algorithm will be O(n log(n)).
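A sketch of that layered bitmap (my code, with assumed names), written as the equivalent AND-tree: a leaf says one element is used, an internal node says its whole block is used. mark() climbs while the sibling is also used; next_unused() finds the smallest unused index >= i in O(log n); the backward direction is symmetric and omitted.

#include <vector>

struct UsedTracker {
    int size;                       // number of leaves, padded to a power of two
    std::vector<char> used;         // 1-based heap layout: children of node j are 2j and 2j+1

    explicit UsedTracker(int n) : size(1) {
        while (size < n) size *= 2;
        used.assign(2 * size, 0);
        for (int i = n; i < size; ++i) used[size + i] = 1;          // padding counts as used
        for (int j = size - 1; j >= 1; --j) used[j] = used[2 * j] && used[2 * j + 1];
    }

    void mark(int i) {              // mark element i as used
        int j = i + size;
        used[j] = 1;
        for (j /= 2; j >= 1 && used[2 * j] && used[2 * j + 1]; j /= 2)
            used[j] = 1;            // both children used: this block is exhausted
    }

    int next_unused(int i) const {  // smallest unused index >= i, or -1 if none
        if (i < 0 || i >= size) return -1;
        int j = i + size;
        if (!used[j]) return i;
        // climb until some block strictly to the right still has a free slot
        while (j > 1 && ((j & 1) == 1 || used[j + 1])) j /= 2;
        if (j == 1) return -1;
        j += 1;                     // that not-yet-exhausted right sibling
        while (j < size) {          // descend to its leftmost unused leaf
            j *= 2;
            if (used[j]) ++j;
        }
        return j - size;
    }
};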

Efficiently find an integer not in a set of size 40, 400, or 4000

Related to the classic problem find an integer not among four billion given ones but not exactly the same.
To clarify, by integers what I really mean is only a subset of its mathematical definition. That is, assume there are only a finite number of integers. Say in C++, they are int in the range of [INT_MIN, INT_MAX].
Now given a std::vector<int> (no duplicates) or std::unordered_set<int>, whose size can be 40, 400, 4000 or so, but not too large, how to efficiently generate a number that is guaranteed to be not among the given ones?
If there were no worry of overflow, then I could multiply all nonzero ones together and add 1 to the product. But there is. The adversary test cases could deliberately contain INT_MAX.
I am more in favor of simple, non-random approaches. Is there any?
Thank you!
Update: to clear up ambiguity, let's say an unsorted std::vector<int> which is guaranteed to have no duplicates. So I am asking if there is anything better than O(n log(n)). Also please note that test cases may contain both INT_MIN and INT_MAX.
You could just return the first of N+1 candidate integers not contained in your input. The simplest candidates are the numbers 0 to N. This requires O(N) space and time.
int find_not_contained(container<int> const& data)
{
    const int N = data.size();
    std::vector<char> known(N+1, 0); // one more candidate than data
    for(int i = 0; i < N; ++i)
        if(data[i] >= 0 && data[i] <= N)
            known[data[i]] = 1;
    for(int i = 0; i <= N; ++i)
        if(!known[i])
            return i;
    assert(false); // should never be reached.
}
Random methods can be more space efficient, but may require more passes over the data in the worst case.
Random methods are indeed very efficient here.
If we want to use a deterministic method, and assuming the size n is not too large (4000 for example), then we can create a vector x of size m = n + 1 (or a little bit larger, 4096 for example, to facilitate the calculation), initialised with 0.
For each i in the range, we just set x[array[i] modulo m] = 1.
Then a simple O(n) search in x will provide a value which is not in array.
Note: the modulo operation here is not exactly the "%" operation; it must return a non-negative result even for negative inputs.
Edit: I mentioned that calculations are made easier by selecting a size of 4096. To be more concrete, this implies that the modulo operation can be performed with a simple & operation.
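A sketch of that approach (my code, not the answerer's): with m = 4096 the modulo is a mask, and any index left at 0 names a residue class that no input value hits, so that index itself is a valid answer.

#include <vector>

int find_absent(const std::vector<int>& a) {            // assumes a.size() < 4096
    const unsigned m = 4096;                            // power of two > n, so the modulo is a mask
    std::vector<char> seen(m, 0);
    for (int v : a)
        seen[static_cast<unsigned>(v) & (m - 1)] = 1;   // non-negative "modulo"
    for (unsigned i = 0; i < m; ++i)
        if (!seen[i])
            return static_cast<int>(i);                 // no input is congruent to i, so i is absent
    return 0;                                           // unreachable while a.size() < m
}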
You can find the smallest unused integer in O(N) time using O(1) auxiliary space if you are allowed to reorder the input vector, using the following algorithm. [Note 1] (The algorithm also works if the vector contains repeated data.)
size_t smallest_unused(std::vector<unsigned>& data) {
    size_t N = data.size(), scan = 0;
    while (scan < N) {
        auto other = data[scan];
        if (other < scan && data[other] != other) {
            data[scan] = data[other];
            data[other] = other;
        }
        else
            ++scan;
    }
    for (scan = 0; scan < N && data[scan] == scan; ++scan) { }
    return scan;
}
The first pass guarantees that if some k in the range [0, N) was found after position k, then it is now present at position k. This rearrangement is done by swapping in order to avoid losing data. Once that scan is complete, the first entry whose value is not the same as its index is not referenced anywhere in the array.
That assertion may not be 100% obvious, since an entry could be referenced from an earlier index. However, in that case the entry could not be the first entry unequal to its index, since the earlier entry would meet that criterion.
To see that this algorithm is O(N), it should be observed that the swap at lines 6 and 7 can only happen if the target entry is not equal to its index, and that after the swap the target entry is equal to its index. So at most N swaps can be performed, and the if condition at line 5 will be true at most N times. On the other hand, if the if condition is false, scan will be incremented, which can also only happen N times. So the if statement is evaluated at most 2N times (which is O(N)).
Notes:
I used unsigned integers here because it makes the code clearer. The algorithm can easily be adjusted for signed integers, for example by mapping signed integers from [INT_MIN, 0) onto unsigned integers [INT_MAX, INT_MAX - INT_MIN) (The subtraction is mathematical, not according to C semantics which wouldn't allow the result to be represented.) In 2's-complement, that's the same bit pattern. That changes the order of the numbers, of course, which affects the semantics of "smallest unused integer"; an order-preserving mapping could also be used.
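A small adapter showing the note's mapping (my code, with assumed names): static_cast<unsigned> reduces each value modulo 2^32, which on two's-complement targets is exactly the "same bit pattern" reinterpretation; the result of smallest_unused is at most data.size(), so it is a small non-negative number and is already a valid int.

#include <cstddef>
#include <vector>

int smallest_unused_signed(const std::vector<int>& input) {
    std::vector<unsigned> data(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
        data[i] = static_cast<unsigned>(input[i]);      // bit-identical reinterpretation
    return static_cast<int>(smallest_unused(data));     // smallest_unused() from the answer above
}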
Make a random x in [INT_MIN..INT_MAX] and test it against all values. On failure, test x+1, and so on (failure is a very rare case for 40/400/4000 values).
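A sketch of this idea (my code): with at most 4000 values present, a uniform random int is almost never a hit, so the expected number of probes is barely above one.

#include <limits>
#include <random>
#include <unordered_set>

int random_absent(const std::unordered_set<int>& s) {
    std::mt19937 gen{std::random_device{}()};
    std::uniform_int_distribution<int> dist(std::numeric_limits<int>::min(),
                                            std::numeric_limits<int>::max());
    int x = dist(gen);
    while (s.count(x) != 0) ++x;   // step forward on the rare hit, as suggested above
    return x;                      // (a fresh draw instead of ++x would avoid the INT_MAX edge case)
}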
Step 1: Sort the vector.
That can be done in O(n log(n)), you can find a few different algorithms online, use the one you like the most.
Step 2: Find the first int not in the vector.
Easily iterate from INT_MIN to INT_MIN + 40/400/4000 checking if the vector has the current int:
Pseudocode:
SIZE = 40|400|4000 // The one you are using
for (int i = 0; i < SIZE; i++) {
    if (array[i] != INT_MIN + i)
        return INT_MIN + i;
}
return INT_MIN + SIZE; // every candidate so far was present
The solution would be O(n log(n) + n) meaning: O(n log(n))
Edit: just read your edit asking for something better than O(n log(n)), sorry.
For the case in which the integers are provided in an std::unordered_set<int> (as opposed to a std::vector<int>), you could simply traverse the range of integer values until you come up against one integer value that is not present in the unordered_set<int>. Searching for the presence of an integer in an std::unordered_set<int> is quite straightforward, since std::unordered_set does provide searching through its find() member function.
The space complexity of this approach would be O(1).
If you start traversing at the lowest possible value for an int (i.e., std::numeric_limits<int>::min()), you will obtain the lowest int not contained in the std::unordered_set<int>:
int find_lowest_not_contained(const std::unordered_set<int>& set) {
    for (auto i = std::numeric_limits<int>::min(); ; ++i) {
        auto it = set.find(i); // search in set
        if (it == set.end())   // integer not in set?
            return i;
    }
}
Analogously, if you start traversing at the greatest possible value for an int (i.e., std::numeric_limits<int>::max()), you will obtain the greatest int not contained in the std::unordered_set<int>:
int find_greatest_not_contained(const std::unordered_set<int>& set) {
    for (auto i = std::numeric_limits<int>::max(); ; --i) {
        auto it = set.find(i); // search in set
        if (it == set.end())   // integer not in set?
            return i;
    }
}
Assuming that the ints are uniformly mapped by the hash function into the unordered_set<int>'s buckets, a search operation on the unordered_set<int> can be achieved in constant time. The run-time complexity would then be O(M), where M is the number of integer values you probe before finding one that is not contained. M is upper-bounded by the size of the unordered_set<int> plus one (i.e., in your case M <= 4001).
Indeed, with this approach, probing any integer range whose size is greater than the size of the unordered_set is guaranteed to come up against an integer value which is not present in the unordered_set<int>.

Merging K Sorted Arrays/Vectors Complexity

While looking into the problem of merging k sorted contiguous arrays/vectors and how it differs in implementation from merging k sorted linked lists I found two relatively easy naive solutions for merging k contiguous arrays and a nice optimized method based off of pairwise-merging that simulates how mergeSort() works. The two naive solutions I implemented seem to have the same complexity, but in a big randomized test I ran it seems one is way more inefficient than the other.
Naive merging
My naive merging method works as follows. We create an output vector<int> and set it to the first of k vectors we are given. We then merge in the second vector, then the third, and so on. Since a typical merge() method that takes in two vectors and returns one is asymptotically linear in both space and time to the number of elements in both vectors the total complexity will be O(n + 2n + 3n + ... + kn) where n is the average number of elements in each list. Since we're adding 1n + 2n + 3n + ... + kn I believe the total complexity is O(n*k^2). Consider the following code:
vector<int> mergeInefficient(const vector<vector<int> >& multiList) {
    vector<int> finalList = multiList[0];
    for (int j = 1; j < multiList.size(); ++j) {
        finalList = mergeLists(multiList[j], finalList);
    }
    return finalList;
}
Naive selection
My second naive solution works as follows:
/**
* The logic behind this algorithm is fairly simple and inefficient.
* Basically we want to start with the first values of each of the k
* vectors, pick the smallest value and push it to our finalList vector.
* We then need to be looking at the next value of the vector we took the
* value from so we don't keep taking the same value. A vector of vector
* iterators is used to hold our position in each vector. While all iterators
* are not at the .end() of their corresponding vector, we maintain a minValue
* variable initialized to INT_MAX, and a minValueIndex variable and iterate over
* each of the k vector iterators and if the current iterator is not an end position
* we check to see if it is smaller than our minValue. If it is, we update our minValue
* and set our minValue index (this is so we later know which iterator to increment after
* we iterate through all of them). We do a check after our iteration to see if minValue
* still equals INT_MAX. If it has, all iterators are at the .end() position, and we have
* exhausted every vector and can stop iterative over all k of them. Regarding the complexity
* of this method, we are iterating over `k` vectors so long as at least one value has not been
* accounted for. Since there are `nk` values where `n` is the average number of elements in each
* list, the time complexity = O(nk^2) like our other naive method.
*/
vector<int> mergeInefficientV2(const vector<vector<int> >& multiList) {
    vector<int> finalList;
    vector<vector<int>::const_iterator> iterators(multiList.size());
    // Set all iterators to the beginning of their corresponding vectors in multiList
    for (int i = 0; i < multiList.size(); ++i) iterators[i] = multiList[i].begin();
    int k = 0, minValue, minValueIndex;
    while (1) {
        minValue = INT_MAX;
        for (int i = 0; i < iterators.size(); ++i) {
            if (iterators[i] == multiList[i].end()) continue;
            if (*iterators[i] < minValue) {
                minValue = *iterators[i];
                minValueIndex = i;
            }
        }
        if (minValue == INT_MAX) break; // every iterator is exhausted
        iterators[minValueIndex]++;
        finalList.push_back(minValue);
    }
    return finalList;
}
Random simulation
Long story short, I built a simple randomized simulation that builds a multidimensional vector<vector<int>>. The multidimensional vector starts with 2 vectors each of size 2, and ends up with 600 vectors each of size 600. Each vector is sorted, and the sizes of the larger container and each child vector increase by two elements every iteration. I time how long it takes for each algorithm to perform like this:
clock_t clock_a_start = clock();
finalList = mergeInefficient(multiList);
clock_t clock_a_stop = clock();
clock_t clock_b_start = clock();
finalList = mergeInefficientV2(multiList);
clock_t clock_b_stop = clock();
I then plotted the results (plot not reproduced here).
My calculations say the two naive solutions (merging and selecting) both have the same time complexity, but the plot shows them as very different. At first I rationalized this by saying there may be more overhead in one vs the other, but then realized that the overhead should be a constant factor and should not produce a plot like that. What is the explanation for this? I assume my complexity analysis is wrong?
Even if two algorithms have the same complexity (O(nk^2) in your case) they may end up having enormously different running times depending upon your size of input and the 'constant' factors involved.
For example, if an algorithm runs in n/1000 time and another algorithm runs in 1000n time, they both have the same asymptotic complexity but they shall have very different running times for 'reasonable' choices of n.
Moreover, there are effects caused by caching, compiler optimizations etc that may change the running time significantly.
For your case, although your calculation of complexities seems to be correct, in the first case the actual running time shall be (nk^2 + nk)/2, whereas in the second case the running time shall be nk^2. Notice that the division by 2 may be significant, because as k increases the nk term shall be negligible.
For a third algorithm, you can modify the Naive selection by maintaining a heap of k elements containing the first elements of all the k vectors. Then your selection process shall take O(logk) time and hence the complexity shall reduce to O(nklogk).
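A sketch of that third algorithm (my code; the k-way merge itself is standard): keep one (value, list, position) entry per list in a min-heap, repeatedly pop the smallest and push the next element from the same list, for O(nk log k) total work.

#include <cstddef>
#include <queue>
#include <tuple>
#include <vector>

std::vector<int> mergeWithHeap(const std::vector<std::vector<int>>& multiList) {
    using Item = std::tuple<int, std::size_t, std::size_t>;   // value, list index, position in list
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    std::size_t total = 0;
    for (std::size_t i = 0; i < multiList.size(); ++i) {
        total += multiList[i].size();
        if (!multiList[i].empty())
            heap.emplace(multiList[i][0], i, 0);              // seed with each list's first element
    }
    std::vector<int> result;
    result.reserve(total);
    while (!heap.empty()) {
        auto [value, list, pos] = heap.top();
        heap.pop();
        result.push_back(value);
        if (pos + 1 < multiList[list].size())
            heap.emplace(multiList[list][pos + 1], list, pos + 1);
    }
    return result;
}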

How can I find the number of consecutive sequences of various lengths that satisfy a particular property?

I am given an array A[] having N elements, which are positive integers.
I have to find the number of sequences of lengths 1, 2, 3, ..., N that satisfy a particular property.
I have built an interval tree with O(n log n) complexity. Now I want to count the number of sequences that satisfy a certain property.
All the properties required for the problem are related to the sum of the sequences.
Note an array will have N*(N+1)/2 sequences. How can I iterate over all of them in O(n log n) or O(n)?
If we let k be the moving index from 0 to N (elements), we will run an algorithm that is essentially looking for the MIN R that satisfies the condition (let's say I); then every other subset for L = k is also satisfied for R >= I (this is your short circuit). After you find I, simply return an output for (L=k, R>=I). This of course assumes that all numerics in your set are >= 0.
To find I, for every k, begin at element k + (N-k)/2. Figure out if the subset defined by (L=k, R=k+(N-k)/2) satisfies your condition. If it does, then decrement R until your condition is NOT met; then R+1 is your MIN (you could choose to print these results as you go, but the results in these cases would be essentially printed backwards). If (L=k, R=k+(N-k)/2) does not satisfy your condition, then INCREMENT R until it does, and this becomes your MIN for that L=k. This halves your search space for each L=k. As k increases and approaches N, your search space continuously decreases.
// This declaration won't work unless N is either a constant or a MACRO defined above
unsigned int myVals[N];
unsigned int Ndiv2 = N / 2;
unsigned int I;
for(unsigned int k = 0; k < N; k++){
    if(TRUE == TESTVALS(myVals, k, Ndiv2)){ // It Passes
        for(I = Ndiv2; I >= k; I--){
            if(FALSE == TESTVALS(myVals, k, I)){
                I++;
                break;
            }
        }
    }else{ // It Didn't Pass
        for(I = Ndiv2; I < N; I++){
            if(TRUE == TESTVALS(myVals, k, I)){
                break;
            }
        }
    }
    // PRINT ALL PAIRS from L=k, from R=I to R=N-1
    if((k & 0x00000001) == 0) Ndiv2++;
} // END --> for(unsigned int k = 0; k < N; k++)
The complexity of the algorithm above is O(N^2). This is because for each k in N (i.e. N iterations / tests) there are no more than N/2 values that need testing. Big O notation isn't concerned with the N/2, nor with the fact that N truly gets smaller as k grows; it is concerned only with the gross magnitude. Thus it would say N tests for every N values, hence O(N^2).
There is an alternative approach which would be FASTER. That approach is, whenever you wish to move within the secondary (inner) for loops, to move by half the remaining distance instead of by one. This would get you to O(N log N) steps overall. For each k in N (which would all have to be tested), you run this half-distance approach to find your MIN R value in log N time. As an example, let's say you have a 1000 element array. When k = 0, we essentially begin the search for MIN R at index 500. If the test passes, instead of linearly moving downward from 500 to 0, we test 250. Let's say the actual MIN R for k = 0 is 300. Then the tests to find MIN R would look as follows:
R=500
R=250
R=375
R=312
R=280
R=296
R=304
R=300
While this is oversimplified, you are most likely going to have to optimize, and test 301 as well as 299 to make sure you're in the sweet spot. Another note is to be careful when dividing by 2 when you have to move in the same direction more than once in a row.
#user1907531: First of all, if you are participating in an online contest of such importance at national level, you should refrain from these cheap tricks and methodologies to get ahead of other deserving guys. Second, a cheater like you is always a cheater, but all this hampers the hard work of those who have put in making the questions and the competitors who are unlike you. Thirdly, if #trumetlicks asks you why you haven't tagged the question as homework, you tell another lie there. And finally, I don't know how so many people could answer this question this cheater asked without knowing the origin/website/source of this question. This surely can't be given by a teacher for homework in any Indian school. To tell everyone: this cheater has asked you for the complete solution of a running collegiate contest in India 6 hours before the contest ended, and he has surely got a lot of direct help and on top of that invited hundreds of others to cheat from the answers given here. So, good luck to all these cheaters.

Given an array of N numbers, find the number of sequences of all lengths having the range of R?

This is a follow-up question to Given a sequence of N numbers, extract number of sequences of length K having range less than R?
I basically need a vector V of size N as an answer, such that V[i] denotes the number of sequences of length i which have range <= R.
Traditionally, in recursive solutions, you would compute the solution for K = 0, K = 1, and then find some kind of recurrence relation between subsequent elements to avoid recomputing the solution from scratch each time.
However here I believe that maybe attacking the problem from the other side would be interesting, because of the property of the spread:
Given a sequence of spread R (or less), any subsequence has a spread inferior to R as well
Therefore, I would first establish a list of the longest subsequences of spread R beginning at each index. Let's call this list M, and have M[i] = j where j is the highest index in S (the original sequence) for which S[j] - S[i] <= R. This is going to be O(N).
Now, for any i, the number of sequences of length K starting at i is either 0 or 1, and this depends on whether K is greater than M[i] - i or not. A simple linear pass over M (from 0 to N-K) gives us the answer. This is once again O(N).
So, if we call V the resulting vector, with V[k] denoting the number of subsequences of length K in S with spread inferior to R, then we can do it in a single iteration over M:
for i in [0, len(M)]:
    for k in [0, M[i] - i]:
        ++V[k]
The algorithm is simple, however the number of updates can be rather daunting. In the worst case, supposing that M[i] - i equals N - i, it is O(N*N) complexity. You would need a better data structure (probably an adaptation of a Fenwick Tree) to use this algorithm and lower the cost of computing those numbers.
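One concrete way to make those range updates cheap (my sketch; it uses a plain difference array rather than the Fenwick tree mentioned above, which is enough here because all updates happen before all reads): each start index i adds one sequence for every length from 1 to M[i] - i + 1, and a single prefix-sum pass turns the difference array into V.

#include <vector>

std::vector<long long> count_by_length(const std::vector<int>& M) {
    const int n = static_cast<int>(M.size());
    std::vector<long long> diff(n + 2, 0);
    for (int i = 0; i < n; ++i) {
        int maxLen = M[i] - i + 1;      // longest valid sequence starting at i
        diff[1] += 1;                   // lengths 1 .. maxLen each gain one sequence
        diff[maxLen + 1] -= 1;
    }
    std::vector<long long> V(n + 1, 0); // V[len] = number of sequences of that length
    long long running = 0;
    for (int len = 1; len <= n; ++len) {
        running += diff[len];
        V[len] = running;
    }
    return V;
}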
If you are looking for contiguous sequences, try doing it recursively: the set of K-length subsequences having a range inferior to R is included in the set of (K-1)-length subsequences.
At K=0, you have N solutions.
Each time you increase K, you append (resp. prepend) the next (resp. previous) element, check if the range is inferior to R, and either store it in a set (look for duplicates!) or discard it depending on the result.
I think the complexity of this algorithm is O(n*n) in the worst-case scenario, though it may be better on average.
I think Matthieu has the right answer when looking for all sequences with spread R.
As you are only looking for sequences of length K, you can do a little better.
Instead of looking at the maximum sequence starting at i, just look at the sequence of length K starting at i, and see if it has range R or not. Do this for every i, and you have all sequences of length K with spread R.
You don't need to go through the whole list, as the latest start point for a sequence of length K is n-K+1. So the complexity is something like (n-K+1)*K = n*K - K*K + K. For K=1 this is n,
and for K=n it is n. For K=n/2 it is n*n/2 - n*n/4 + n/2 = n*n/4 + n/2, which I think is the maximum. So while this is still O(n*n), for most values of K you get a little better.
Start with a simpler problem: for each starting index, count the maximal length of the sequence starting there whose range does not exceed R.
To do this, let the first pointer point to the first element of the array. Increase the second pointer (also starting from the first element of the array) while the sequence between the pointers has range less than or equal to R. Push every array element passed by the second pointer to a min-max queue, made of a pair of min-max stacks, described in this answer. When the difference between the max and min values reported by the min-max queue exceeds R, stop increasing the second pointer, increment V[ptr2-ptr1], increment the first pointer (removing the element pointed to by it from the min-max queue), and continue increasing the second pointer (keeping the range under control).
When the second pointer leaves the bounds of the array, increment V[N-ptr1] for all remaining ptr1 (the corresponding ranges are also at most R). To account for all the shorter sequences as well, compute the cumulative sum of the array V[], starting from its end.
Both time and space complexities are O(N).
Pseudo-code:
p1 = p2 = 0;
do {
do {
min_max_queue.push(a[p2]);
++p2;
} while (p2 < N && min_max_queue.range() <= R);
if (p2 < N) {
++v[p2 - p1 - 1];
min_max_queue.pop();
++p1;
}
} while (p2 < N);
for (i = 1; i <= N-p1; ++i) {
++v[i];
}
sum = 0;
for (j = N; j > 0; --j) {
value = v[j];
v[j] += sum;
sum += value;
}