To find the min and max after addition and subtraction over a range of numbers - c++

I have an algorithm question: numbers from 1 to N are given, a number of operations are performed on them, and then the min/max has to be found among the results.
There are two operations - addition and subtraction.
Operations are given in the form a b c d, where a is the operation to be performed (1 = add, 2 = subtract), b is the starting position, c is the ending position, and d is the number to be added/subtracted.
For example, suppose the numbers are 1 to N with
N = 5
1 2 3 4 5
We perform the operations
1 2 4 5
2 1 3 4
1 4 5 6
After each operation the numbers become:
1 7 8 9 5
-3 3 4 9 5
-3 3 4 15 11
So the maximum is 15 and the minimum is -3.
My Approach:
I stored the numbers between the lower and upper limits (in this case just 1 to 5) in an array, applied the operations directly, and then found the minimum and maximum.
Could there be any better approach?

I will assume that all update (addition/subtraction) operations happen before finding the max/min. I don't have a good solution for update and min/max operations mixed together.
You can use a plain array, where the value at index i of the array is the difference between index i and index (i - 1) of the original array. This makes the sum from index 0 to index i of our array equal to the value at index i of the original array.
Subtraction is addition with the negated number, so the two can be treated the same way. When we need to add k to the original array from index i to index j, we add k at index i of our array and subtract k at index (j + 1) of our array. This takes O(1) time per update.
You can find the min/max of the original array by accumulating the values and recording the max/min along the way. This takes O(n) time, and I assume it is done once for the whole array.
Pseudocode:
a[N]     // Original array
d[N+1]   // Difference array (one spare slot so an update ending at N-1 stays in bounds)

// Initialization
d[0] = a[0]
for (i = 1 to N-1)
    d[i] = a[i] - a[i - 1]

// Addition (subtraction is the same with a negated amount)
add(from_idx, to_idx, amount) {
    d[from_idx] += amount
    d[to_idx + 1] -= amount
}

// Find max/min for the WHOLE array after all updates
current = max = min = d[0]
for (i = 1 to N - 1) {
    current += d[i]  // Sum from d[0] to d[i] is a[i]
    max = MAX(max, current)
    min = MIN(min, current)
}
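To make this concrete, here is a runnable C++ sketch of the pseudocode, applied to the example at the top of the question. The 1-based update signature and the op codes (1 = add, 2 = subtract) are my reading of the question's input format, not something fixed by the answer:

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<long long> a{1, 2, 3, 4, 5}; // the example array
    const int n = a.size();

    // Difference array with one spare slot so to_idx == n-1 is safe.
    std::vector<long long> d(n + 1, 0);
    d[0] = a[0];
    for (int i = 1; i < n; ++i)
        d[i] = a[i] - a[i - 1];

    // op 1 adds amount on the 1-based inclusive range [from..to], op 2 subtracts.
    auto update = [&](int op, int from, int to, long long amount) {
        if (op == 2) amount = -amount;
        d[from - 1] += amount; // range start, converted to 0-based
        d[to] -= amount;       // one past the 0-based range end
    };
    update(1, 2, 4, 5);
    update(2, 1, 3, 4);
    update(1, 4, 5, 6);

    // One O(n) prefix-sum sweep recovers each a[i] and tracks min/max.
    long long current = d[0], mx = current, mn = current;
    for (int i = 1; i < n; ++i) {
        current += d[i];
        mx = std::max(mx, current);
        mn = std::min(mn, current);
    }
    std::cout << "min = " << mn << ", max = " << mx << '\n'; // min = -3, max = 15
}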

Generally there is no single "best way" to find the min/max from a performance point of view, because it depends on how the application will be used.
- Finding the max and min in a list takes O(n) time, so if you run many operations (relative to the size of the input), your approach of finding the min/max once after all the operations is fine.
- But if the list holds many elements and you run relatively few operations, you are better off checking after each operation whether its result is a new max/min and updating if necessary.


Counting inversions after swapping two elements of an array

You are given a permutation p1,p2,...,pn of numbers from 1 to n.
A permutation is a sequence of integers from 1 to n of length n containing each number exactly once.
You are given q queries, where each query consists of two integers a and b. In response to each query you need to return the number of inversions of the permutation after swapping the elements at indices a and b. Every query is independent, i.e. after each query the permutation is restored to its initial state.
An inversion in a permutation p is a pair of indices (i, j) such that i > j and pi < pj. For example, a permutation [4, 1, 3, 2] contains 4 inversions: (2, 1), (3, 1), (4, 1), (4, 3).
Input: The first line contains n,q.
The second line contains the space-separated permutation p1,p2,...,pn.
Each line of the next q lines contains two integers a,b.
Output: For each query, print an integer denoting the number of inversions on a new line.
Sample input:
5 5
1 2 3 4 5
1 2
1 3
2 5
2 4
3 3
Output:
1
3
5
3
0
Constraints:
2<=n<=1000
1<=q<=200000
My approach: I count the number of inversions using a BIT (https://www.geeksforgeeks.org/count-inversions-array-set-3-using-bit/) for each query after swapping the elements at positions a and b, and then swap them back so that my array remains unchanged. But this solution gives TLE for large test cases. Is there a better approach for this problem?
You are getting TLE probably because the number of computations in this approach is q * (n * log(n)) = 2 * 10^5 * 10^3 * log(1000) = ~10^9, which is more than the generally accepted limit of ~10^8.
I can think of the following solution. Please note that I have not coded / verified it:
Denote ri == the number of indices j such that i > j && pi < pj. E.g., in [2, 3, 1, 4], r3 = 2. Basically, it is the number of inversions whose farther index is i. (Please note that I am using 1-based indexing as per the question. Also, a < b as per the question.)
Thus we have: Sum of ri == #invs (number of inversions)
We can calculate initial total #invs in O(n^2)
When a and b are swapped, we can observe that:
a) ri remains constant where i < a.
b) ri remains constant where i > b.
Only ri changes where a <= i <= b, and only under the following conditions. I am considering the case pa < pb; the exact opposite case needs to be considered when pa > pb.
a) Since pa < pb, this swap causes #invs = #invs + 1.
b) If (pi < pa && pi < pb) || (pi > pa && pi > pb), this swap does not change ri. E.g.: [2,....10,....5]. Here swapping 2 and 5 does not change the r value for 10.
c) If pa < pi < pb, it will increment ri by 1, and the new rb by 1. E.g.: [2,....3,.....4]; when 2 and 4 are swapped, we have [4,....3,....2]: the r value of 3 increases by 1 (because of 4), and the r value of 2 also increases by 1 (because of 3). Please note that the increment caused by 4 > 2 was already counted in step (a), and needs to be counted once only.
d) We need to find the number of indices i where pa < pi < pb, as we started with above. Let us call it f(a, b). Then the total change in #invs is delta = (2 * f(a, b)) + 1, and the answer will be #original_invs + delta.
As mentioned, the exact opposite steps need to be done for the case pa > pb. The delta will be negative in that case.
Now, the only thing remaining is to solve: given a and b, find f(a, b) efficiently. For this, we can pre-process and store it for all pairs of indices. This will take O(N^2) space and O(N^2 * log(N)) time, using a balanced binary search tree (BST). Again, I am showing the pre-processing steps for the case pa < pb only; another set of pre-processing steps needs to be done for the other case:
We will use a self-balancing BST, in which each node also contains the following fields:
a) field_1: the size of the node's left subtree. This value is updated on every insert operation if the size of the left subtree changes.
b) field_2: the number of elements < node.value that the tree contained when this node was inserted. This value is initialized once, when the node is inserted, and does not change thereafter. A short explanation of how it is computed is given in Addendum A. This field is basically our pre-processing, which will determine f(a, b).
With all of this, for each index i, where 0 <= i < n, do the following: create a new tree, and insert the values pj into it one by one, where (i < j < n) && (pi < pj). (Please note we are not inserting values where pi > pj.) The method given in Addendum A will make sure we find f(i, j) while inserting.
There will be n such pre-processed trees, one for every index. To find f(a, b), look into the a-th tree and search for node.value = pb. That node's field_2 = f(a, b).
The complexity of insertion is O(log N), so the total pre-processing computation is O(N^2 * log N). Search is O(log N), so the query complexity is O(q * log N). Total complexity = O(N^2) + O(N^2 * log N) + O(q * log N), which turns out to be ~10^7.
==============================================================================
Addendum A: How to populate field_2 while inserting a node:
i) Insert the node and balance the tree. Update field_1 as required.
ii) Initialize ans = 0. Traverse the BST from the root, searching for your node's value.
iii) At every node on the search path: if node.value < search_key_b, ans += node.left_subtree_size + 1. Repeat until the node is found.
iv) ans -= 1
We can solve this in O(n log n) space and O(n log n + Q * log^2(n)) time with a merge-sort tree. The merge-sort tree allows us to find the number of elements inside a subarray that are greater than or lower than an input number in O(log^2(n)) time and O(n log n) space.
First we record the total number of inversions in O(n log n) time, for which there are known methods. To query the effect of a swap bounded by left and right, consider the subarray between them:
- subtract the number of elements greater than right in the subarray (those will no longer be inversions)
- subtract the number of elements smaller than left in the subarray (those will no longer be inversions)
- add the number of elements greater than left in the subarray (those will be new inversions)
- add the number of elements smaller than right in the subarray (those will be new inversions)
- if right > left, add 1
- if left > right, subtract 1
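For reference, here is a minimal merge-sort-tree sketch; the structure and names are mine, not from the answer. build() merges the sorted halves bottom-up, and count_less() descends over the O(log n) covering nodes and binary searches each one, which gives the O(log^2 n) per-query bound quoted above:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

struct MergeSortTree {
    int n;
    std::vector<std::vector<int>> t;

    explicit MergeSortTree(const std::vector<int>& p) : n(p.size()), t(4 * p.size()) {
        build(1, 0, n - 1, p);   // assumes p is non-empty
    }
    void build(int node, int lo, int hi, const std::vector<int>& p) {
        if (lo == hi) { t[node] = {p[lo]}; return; }
        int mid = (lo + hi) / 2;
        build(2 * node, lo, mid, p);
        build(2 * node + 1, mid + 1, hi, p);
        std::merge(t[2 * node].begin(), t[2 * node].end(),
                   t[2 * node + 1].begin(), t[2 * node + 1].end(),
                   std::back_inserter(t[node]));
    }
    // Number of elements < x in p[l..r] (0-based, inclusive): O(log^2 n).
    int count_less(int node, int lo, int hi, int l, int r, int x) const {
        if (r < lo || hi < l) return 0;
        if (l <= lo && hi <= r)
            return std::lower_bound(t[node].begin(), t[node].end(), x) - t[node].begin();
        int mid = (lo + hi) / 2;
        return count_less(2 * node, lo, mid, l, r, x)
             + count_less(2 * node + 1, mid + 1, hi, l, r, x);
    }
    int count_less(int l, int r, int x) const { return count_less(1, 0, n - 1, l, r, x); }
    // Permutation values are distinct, so "greater than x" is length minus "< x+1".
    int count_greater(int l, int r, int x) const {
        return (r - l + 1) - count_less(l, r, x + 1);
    }
};

int main() {
    std::vector<int> p{4, 1, 3, 2};              // the example permutation
    MergeSortTree mst(p);
    std::cout << mst.count_less(1, 2, 3) << '\n';    // elements of {1,3} below 3 -> 1
    std::cout << mst.count_greater(1, 2, 1) << '\n'; // elements of {1,3} above 1 -> 1
}

For a swap of positions a < b, the subarray to query is (a+1 .. b-1), with x set to the two values being swapped, matching the four counts in the list above.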

Populating a vector with numbers and conditions in C++

Working on a business class assignment where we're using Excel to solve a problem with the following setup and conditions, but I wanted to find solutions by writing some code in C++, which is what I'm most familiar with from some school courses.
We have 4 stores where we need to invest 10 million dollars. The main conditions are:
It is necessary to invest at least 1mil per store.
The investments in the 4 stores must total 10 million.
Following the rules above, the most one can invest in a single store is 7 million
Each store has its own unique return of investment percentages based off the amount of money invested per store.
In other words, there is a large number of combinations that can be obtained by investing in each store. Repetition of numbers does not matter as long as the total is 10 per combination, but the order of the numbers does matter.
If my math is right, the total number of combinations is 7^4 = 2401, but the number of working solutions is smaller due to the condition that each combination must sum to 10.
What I'm trying to do in C++ is use loops to populate each row with 4 numbers such that their sum equals 10 (millions), for example:
7 1 1 1
1 7 1 1
1 1 7 1
1 1 1 7
6 2 1 1
6 1 2 1
6 1 1 2
5 3 1 1
5 1 3 1
5 1 1 3
5 1 2 2
5 2 1 2
5 2 2 1
I'd appreciate advice on how to tackle this. I'm still not quite sure whether using loops over an array (a 2D array/vector perhaps?) is a good idea; I have a vague idea that recursive functions might facilitate a solution.
Thanks for taking some time to read, I appreciate any and all advice for coming up with solutions.
Edit:
Here's some code I worked on to just get 50 rows of randomized numbers. I still have to implement the condition that valid rows must sum to 10 across the 4 values:
#include <cstdlib>   // rand
#include <iostream>
using namespace std;

int main() {
    const int rows = 50;
    int values[rows][4];
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j <= 3; j++) {
            values[i][j] = (rand() % 7 + 1); // random investment in 1..7
            cout << values[i][j] << " ";
        }
        cout << endl;
    }
}
You can calculate this recursively. For each level, you have:
A target sum
The number of elements in that level
The minimum value each individual element can have
First, we determine our return type. What's your final output? Looks like a vector of vectors to me, so our recursive function will return the same.
Second, we determine the result of our degenerate case (at the "bottom" of the recursion), when the number of elements in this level is 1.
std::vector<std::vector<std::size_t>> recursive_combinations(std::size_t sum, std::size_t min_val, std::size_t num_elements)
{
    std::vector<std::vector<std::size_t>> result {};
    if (num_elements == 1)
    {
        result.push_back(std::vector<std::size_t>{sum});
        return result;
    }
    ...non-degenerate case goes here...
    return result;
}
Next, we determine what happens when this level has more than 1 element in it. Split the sum into all possible pairs of the "first" element and the "remaining" group. E.g., if we have a target sum of 5, a num_elements of 3, and a min_val of 1, we'd generate the pairs {1,4}, {2,3}, and {3,2}, where the first number in each pair is for the first element, and the second number in each pair is the remaining sum left over for the remaining group.
Recursively call recursive_combinations using this second number as the new sum and num_elements - 1 as the new num_elements to find the vector of vectors for the remaining group, and for each vector in the returned set, prepend the corresponding first element. A complete sketch follows below.
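Here is one way the finished function might look; the loop bound that reserves min_val for each remaining element is my addition to the answer's outline:

#include <cstddef>
#include <iostream>
#include <vector>

std::vector<std::vector<std::size_t>> recursive_combinations(std::size_t sum, std::size_t min_val, std::size_t num_elements)
{
    std::vector<std::vector<std::size_t>> result{};
    if (num_elements == 1)
    {
        result.push_back(std::vector<std::size_t>{sum});
        return result;
    }
    // First element ranges from min_val up to whatever leaves at least
    // min_val for each of the remaining elements.
    for (std::size_t first = min_val; first + min_val * (num_elements - 1) <= sum; ++first)
    {
        for (auto& rest : recursive_combinations(sum - first, min_val, num_elements - 1))
        {
            std::vector<std::size_t> row{first};
            row.insert(row.end(), rest.begin(), rest.end());
            result.push_back(std::move(row));
        }
    }
    return result;
}

int main()
{
    // The assignment's numbers: 4 stores, at least 1 (million) each, total 10.
    auto rows = recursive_combinations(10, 1, 4);
    std::cout << rows.size() << " combinations\n"; // 84
    for (const auto& row : rows)
    {
        for (auto v : row) std::cout << v << ' ';
        std::cout << '\n';
    }
}

For these inputs the function yields 84 rows, and the 7-million cap holds automatically, since the other three stores must each take at least 1.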

Finding the permutation that satisfy given condition

I want to find the number of all permutations of n numbers. The numbers are from 1 to n. The given condition is that the i-th position can hold numbers only up to Si, where Si is given for each position.
1 <= n <= 10^6
1 <= Si <= n
For example:
n = 5
then its five elements will be
1,2,3,4,5
and the given Si for each position are:
2,3,4,5,5
This shows that:
the 1st position can hold 1 to 2, that is 1 or 2, but cannot hold any number among 3 to 5.
Similarly,
the 2nd position can hold numbers 1 to 3 only,
the 3rd position can hold numbers 1 to 4 only,
the 4th position can hold numbers 1 to 5 only,
the 5th position can hold numbers 1 to 5 only.
Some of its permutations are:
1,2,3,4,5
2,3,1,4,5
2,3,4,1,5 etc.
But these cannot be:
3,1,4,2,5 as 3 is present at the 1st position.
1,2,5,3,4 as 5 is present at the 3rd position.
I am not getting any idea of how to count all possible permutations under the given condition.
Okay, if we have a guarantee that the numbers si are given in non-descending order, then it is possible to calculate the number of permutations in O(n).
The idea of straightforward algorithm is as follows:
At step i multiply the result by current value of si[i];
We choose some number for position i. As we need a permutation, that number cannot be repeated, so decrement all the remaining si[k], for k from i+1 to the end (i.e. n), by 1;
Increase i by 1, go back to (1).
To illustrate on example for si: 2 3 3 4:
result = 1;
current si is "2 3 3 4", result *= si[0] (= 1*2 == 2), decrease 3, 3 and 4 by 1;
current si is "..2 2 3", result *= si[1] (= 2*2 == 4), decrease last 2 and 3 by 1;
current si is "....1 2", result *= si[2] (= 4*1 == 4), decrease last number by 1;
current si is "..... 1", result *= si[3] (= 4*1 == 4), done.
However, this straightforward approach would require O(n^2) time due to the decreasing steps. To optimize it we can easily observe that at the moment of result *= si[i], our si[i] has already been decreased exactly i times (assuming we start from 0, of course).
Thus O(n) way:
unsigned int result = 1;
for (unsigned int i = 0; i < n; ++i)
{
    result *= (si[i] - i);
}
For each si, count the number of elements in your array such that a[i] <= si using binary search, and store the value in an array count[i]; the answer is then the product of all count[i]. However, we have to remove the double counting from the answer (as the same number could be counted twice): sort the si, check how many numbers are <= s[i], and decrease each count accordingly. The complexity is O(n log(n)). Hope I at least give you an idea.
To complete Yuriy Ivaskevych's answer: if you don't know whether the sis are in increasing order, you can sort the sis and it will also work.
And the result will be zero or negative if the permutation is impossible (e.g.: 1 1 1 1 1).
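Combining the two observations (sort first, then multiply the factors si - i), a compact sketch might look like this; the 64-bit type and the clamp to zero for impossible inputs are my choices, and for n up to 10^6 a real solution would likely multiply modulo some modulus to avoid overflow:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    std::vector<std::uint64_t> s{2, 3, 4, 5, 5};   // the Si values from the question
    std::sort(s.begin(), s.end());
    std::uint64_t result = 1;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // After filling i earlier positions, i of the allowed numbers are
        // taken, leaving s[i] - i choices; s[i] <= i means no permutation.
        result *= (s[i] > i) ? (s[i] - i) : 0;
    }
    std::cout << result << '\n';                   // 2*2*2*2*1 = 16
}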
You can try backtracking; it's a bit of a hardcore approach, but it will work.
try:
http://www.thegeekstuff.com/2014/12/backtracking-example/
or google backtracking tutorial C++

Similar to subset sum [duplicate]

This problem was asked to me in Amazon interview -
Given an array of positive integers, you have to find the smallest positive integer that cannot be formed from a sum of numbers from the array.
Example:
Array:[4 13 2 3 1]
result = 11 { since 11 is the smallest positive number which cannot be formed from the given array elements }
What I did was:
- sorted the array
- calculated the prefix sums
- traversed the sum array, checking whether the next element is at most 1 greater than the running sum, i.e. A[j] <= sum + 1; if not, the answer is sum + 1
But this was an O(n log n) solution.
Interviewer was not satisfied with this and asked a solution in less than O(n log n) time.
There's a beautiful algorithm for solving this problem in time O(n + Sort), where Sort is the amount of time required to sort the input array.
The idea behind the algorithm is to sort the array and then ask the following question: what is the smallest positive integer you cannot make using the first k elements of the array? You then scan forward through the array from left to right, updating your answer to this question, until you find the smallest number you can't make.
Here's how it works. Initially, the smallest number you can't make is 1. Then, going from left to right, do the following:
If the current number is bigger than the smallest number you can't make so far, then you know the smallest number you can't make - it's the one you've got recorded, and you're done.
Otherwise, the current number is less than or equal to the smallest number you can't make. The claim is that you can indeed make this number. Right now, you know the smallest number you can't make with the first k elements of the array (call it candidate) and are now looking at the value A[k]. The number candidate - A[k] must therefore be some number that you can indeed make with the first k elements of the array, since otherwise candidate - A[k] would be a smaller number than the smallest number you allegedly can't make with the first k numbers in the array. Moreover, you can make any number in the range candidate to candidate + A[k] - 1, inclusive, because you can start with any number in the range from 1 to A[k], inclusive, and then add candidate - 1 to it. Therefore, set candidate to candidate + A[k] and increment k.
In pseudocode:
Sort(A)
candidate = 1
for i from 1 to length(A):
    if A[i] > candidate: return candidate
    else: candidate = candidate + A[i]
return candidate
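A direct C++ translation of the pseudocode might look like this (a sketch; it assumes the running sum fits in 64 bits):

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    std::vector<std::uint64_t> a{4, 13, 2, 1, 3};
    std::sort(a.begin(), a.end());
    std::uint64_t candidate = 1;
    for (std::uint64_t x : a) {
        if (x > candidate) break; // gap found: candidate is unreachable
        candidate += x;           // every sum in 1..candidate-1 is now reachable
    }
    std::cout << candidate << '\n'; // prints 11
}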
Here's a test run on [4, 13, 2, 1, 3]. Sort the array to get [1, 2, 3, 4, 13]. Then, set candidate to 1. We then do the following:
A[1] = 1, candidate = 1:
A[1] ≤ candidate, so set candidate = candidate + A[1] = 2
A[2] = 2, candidate = 2:
A[2] ≤ candidate, so set candidate = candidate + A[2] = 4
A[3] = 3, candidate = 4:
A[3] ≤ candidate, so set candidate = candidate + A[3] = 7
A[4] = 4, candidate = 7:
A[4] ≤ candidate, so set candidate = candidate + A[4] = 11
A[5] = 13, candidate = 11:
A[5] > candidate, so return candidate (11).
So the answer is 11.
The runtime here is O(n + Sort) because outside of sorting, the runtime is O(n). You can clearly sort in O(n log n) time using heapsort, and if you know some upper bound on the numbers you can sort in time O(n log U) (where U is the maximum possible number) by using radix sort. If U is a fixed constant (say, 10^9), then radix sort runs in time O(n) and this entire algorithm then runs in time O(n) as well.
Hope this helps!
Use bitvectors to accomplish this in linear time.
Start with an empty bitvector b. Then for each element k in your array, do this:
b = b | b << k | 2^(k-1)
To be clear, the i-th bit is set to 1 to represent the number i, and | 2^(k-1) sets the k-th bit to 1.
After you finish processing the array, the index of the first zero in b is your answer (counting from the right, starting at 1).
b=0
process 4: b = b | b<<4 | 1000 = 1000
process 13: b = b | b<<13 | 1000000000000 = 10001000000001000
process 2: b = b | b<<2 | 10 = 1010101000000101010
process 3: b = b | b<<3 | 100 = 1011111101000101111110
process 1: b = b | b<<1 | 1 = 11111111111001111111111
First zero: position 11.
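A sketch of this idea with std::bitset; the fixed capacity of 64 bits is an assumption that happens to cover this example (the reachable sums stay below 24), and a real input would need a bound derived from the total of the array:

#include <bitset>
#include <iostream>
#include <vector>

int main() {
    std::vector<unsigned> a{4, 13, 2, 3, 1};
    std::bitset<64> b; // bit (i-1) set <=> sum i is reachable
    for (unsigned k : a)
        b = b | (b << k) | (std::bitset<64>(1) << (k - 1));
    unsigned i = 0;    // find the first zero bit, from the low end
    while (b[i]) ++i;
    std::cout << i + 1 << '\n'; // prints 11
}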
Consider all integers in the interval [2^i .. 2^(i+1) - 1], and suppose all integers below 2^i can be formed as sums of numbers from the given array. Also suppose that we already know C, the sum of all numbers below 2^i. If C >= 2^(i+1) - 1, every number in this interval can be represented as a sum of given numbers. Otherwise we check whether the interval [2^i .. C + 1] contains any number from the given array. If there is no such number, C + 1 is what we searched for.
Here is a sketch of an algorithm:
For each input number, determine to which interval it belongs, and update the corresponding sum: S[int_log(x)] += x.
Compute prefix sums of array S: foreach i: C[i] = C[i-1] + S[i].
Filter array C to keep only entries whose values are lower than the next power of 2.
Scan the input array once more and note which of the intervals [2^i .. C + 1] contain at least one input number: i = int_log(x) - 1; B[i] |= (x <= C[i] + 1).
Find the first interval that is not filtered out in step #3 and whose corresponding element of B[] is not set in step #4.
If it is not obvious why we can apply step 3, here is the proof. Choose any number between 2^i and C, then sequentially subtract from it all the numbers below 2^i in decreasing order. Eventually we get either some number less than the last subtracted number, or zero. If the result is zero, just add together all the subtracted numbers and we have a representation of the chosen number. If the result is non-zero and less than the last subtracted number, it is also less than 2^i, so it is "representable" and none of the subtracted numbers are used for its representation. When we add these subtracted numbers back, we have a representation of the chosen number. This also suggests that instead of filtering intervals one by one, we could skip several intervals at once by jumping directly to int_log of C.
Time complexity is determined by the function int_log(), the integer logarithm, i.e. the index of the highest set bit in a number. If our instruction set contains integer logarithm or any equivalent (count leading zeros, or tricks with floating point numbers), then complexity is O(n). Otherwise we could use some bit hacking to implement int_log() in O(log log U) and obtain O(n * log log U) time complexity. (Here U is the largest number in the array.)
If step 1 (in addition to updating the sum) also updates the minimum value in the given range, step 4 is not needed anymore. We can just compare C[i] to Min[i+1]. This means we need only a single pass over the input array; we could even apply this algorithm to a stream of numbers rather than an array.
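A sketch of that single-pass variant for 32-bit inputs; the code is mine, and it assumes __builtin_clz (a GCC/Clang builtin) for int_log. It merges steps 3-5 into a single check per interval and reproduces the C + 1 values of the examples below:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <limits>
#include <vector>

// int_log = index of the highest set bit; valid for x >= 1.
static int int_log(std::uint32_t x) { return 31 - __builtin_clz(x); }

std::uint64_t smallest_missing(const std::vector<std::uint32_t>& a) {
    std::uint64_t S[32] = {0}; // S[i]: sum of inputs in [2^i .. 2^(i+1)-1]
    std::uint64_t mn[32];      // mn[i]: minimum input in that interval
    std::fill(std::begin(mn), std::end(mn), std::numeric_limits<std::uint64_t>::max());
    for (std::uint32_t x : a) {
        int i = int_log(x);
        S[i] += x;
        mn[i] = std::min<std::uint64_t>(mn[i], x);
    }
    std::uint64_t C = 0;       // sum of all inputs below 2^i
    for (int i = 0; i < 32; ++i) {
        // If C+1 has not yet outgrown this interval and no input lies in
        // [2^i .. C+1], then C+1 cannot be represented.
        if (C + 1 < (1ull << (i + 1)) && mn[i] > C + 1)
            return C + 1;
        C += S[i];
    }
    return C + 1;
}

int main() {
    std::cout << smallest_missing({4, 13, 2, 3, 1}) << '\n'; // 11
    std::cout << smallest_missing({1, 2, 3, 9}) << '\n';     // 7
    std::cout << smallest_missing({1, 1, 2, 9}) << '\n';     // 5
}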
Several examples:

Input:       [ 4 13  2  3  1]    [ 1  2  3  9]    [ 1  1  2  9]
int_log:       2  3  1  1  0       0  1  1  3       0  0  1  3
interval i:  0  1  2  3          0  1  2  3       0  1  2  3
S:           1  5  4 13          1  5  0  9       2  2  0  9
C:           1  6 10 23          1  6  6 15       2  4  4 13
filtered(C): n  n  n  n          n  n  n  n       n  n  n  n
number in
[2^i..C+1]:  2  4  -             2  -  -          2  -  -
C+1:         11                  7                5
For multi-precision input numbers, this approach needs O(n * log M) time and O(log M) space, where M is the largest number in the array. The same time is needed just to read all the numbers (and in the worst case we need every bit of them).
Still, this result may be improved to O(n * log R), where R is the value found by this algorithm (actually, by the output-sensitive variant of it). The only modification needed for this optimization is, instead of processing whole numbers at once, to process them digit by digit: the first pass processes the low-order bits of each number (like bits 0..63), the second pass the next bits (like 64..127), etc. We can ignore all higher-order bits after the result is found. This also decreases the space requirements to O(K) numbers, where K is the number of bits in a machine word.
If you sort the array, it will work for you. Counting sort could do it in O(n), but in a practically large scenario the range can be pretty high.
Quicksort at O(n*log n) will do the work for you:
def smallestPositiveInteger(array):
    candidate = 1
    n = len(array)
    array = sorted(array)
    for i in range(0, n):
        if array[i] <= candidate:
            candidate += array[i]
        else:
            break
    return candidate

Sum of difference of a number to an array of numbers

This is my problem.
Given an array of integers and another integer k, find the sum of the absolute differences between each element of the array and k.
For example if the array is 2, 4, 6, 8, 10 and k is 3
Sum of difference
= abs(2 - 3) + abs(4-3) + abs(6 - 3) + abs(8 - 3) + abs(10 - 3)
= 1 + 1 + 3 + 5 + 7
= 17
The array remains the same throughout and can contain up to 100000 elements and there will be 100000 different values of k to be tested. k may or may not be an element of the array. This has to be done within 1s or about 100M operations. How do I achieve this?
You can run multiple queries for sums of absolute differences in O(log N) if you add a preprocessing step which costs O(N * log N).
Sort the array, then for each item in the array store the sum of all numbers that are smaller than or equal to the corresponding item. This can be done in O(N * log N). Now you have a pair of arrays that look like this:
2 4 6 8 10 // <<== Original data
2 6 12 20 30 // <<== Partial sums
In addition, store the total T of all numbers in the array.
Now you can get sums of absolute differences by running a binary search for k on the original array and using the partial sums to compute the answer: take the count of numbers to the left of the target times k and subtract the sum of those numbers; then take the sum of the numbers to the right and subtract the count of those numbers times k; finally, add the two results together. The partial sum of the numbers to the right can be computed by subtracting the partial sum on the left from the total T.
For k=3 binary search gets you to position 1.
Partial sum on the left is 2
Count of items on the left is 1
Partial sum on the right is (30-2)=28
Count of items on the right is 4
You compute (1*3-2) + (28-4*3) = 1 + 16 = 17
First sort the array, and then compute an array that stores the prefix sums of the resulting sorted array. Let's denote this array p; you can compute p in linear time so that p[i] = a[0] + a[1] + ... + a[i]. Having this array, you can answer in constant time the question "what is the sum of the elements a[x] + a[x+1] + ... + a[y] (i.e. with indices x to y)?": simply compute p[y] - p[x-1] (take special care when x is 0).
Now, to answer a query of the type "what is the sum of absolute differences with k?", we split the problem into two parts: the sum over the numbers greater than k and the sum over the numbers smaller than k. To compute these, perform a binary search for the position of k in the sorted a (denote it idx), and compute the sum of the values in a before idx (denote it s) and after idx (denote it S). The sum of absolute differences with k is then idx * k - s + S - (a.length - idx) * k. This is of course pseudocode, and by a.length I mean the number of elements in a.
After performing a linearithmic precomputation, you will be able to answer a query in O(log(n)). Note this approach only makes sense if you plan to perform multiple queries; if you are only going to perform a single query, you cannot possibly go faster than O(n).
Just implementing dasblinkenlight's solution in "contest C++":
It does exactly as he says: reads the values, sorts them, and stores the accumulated sum in V[i].second, where V[i].second is the accumulated sum up to index i-1 (to simplify the algorithm). It also stores a sentinel in V[n] for cases when the query is greater than max(V).
Then, for each query, it binary searches for the value. Here V[a].second is the sum of the values less than the query, and V[n].second - V[a].second is the sum of the values greater than it.
#include <iostream>
#include <algorithm>
#define pii pair<int, int>
using namespace std;

pii V[100001];

int main() {
    int n;
    while (cin >> n) {
        for (int i = 0; i < n; i++)
            cin >> V[i].first;
        sort(V, V + n);
        // V[i].second = sum of the sorted values before index i;
        // V[n] is the sentinel carrying the total sum.
        V[0].second = 0;
        for (int i = 1; i <= n; i++)
            V[i].second = V[i - 1].first + V[i - 1].second;
        int k; cin >> k;
        for (int i = 0; i < k; i++) {
            int query; cin >> query;
            pii* res = upper_bound(V, V + n, pii(query, 0));
            int a = res - V, b = n - (res - V);                // counts left/right of query
            int left = query * a - V[a].second;                // sum of (query - smaller)
            int right = V[n].second - V[a].second - query * b; // sum of (greater - query)
            cout << left + right << endl;
        }
    }
}
It assumes a file with a format like this:
5
10 2 8 4 6
2
3 5
Then, for each query, it answers like this:
17
13