Sum of values at common indexes in all subsets? - c++

In a recent problem I have to sum all values at common indexes over all possible subsets of size k of an array of size n.
For example, if
array ={1,2,3}
Its subsets (k=2) will be (x [i] , x [j]) where i < j
1 2
1 3
2 3
Sum:4,8
First I used recursion (the same as for generating all subsets):
int sum_index[k]={0};
void sub_set(int array[],int n,int k,int temp[],int q=0,int r=0)
{
if(q==k)
{
for(int i=0;i<k;i++)
sum_index[i]+=temp[i];
}
else
{
for(int i=r;i<n;i++)
{
temp[q]=array[i];
sub_set(array,n,k,temp,q+1,i+1);
}
}
}
The problem is that it takes much more time than expected.
Then I modified it to:
void sub_set(int array[],int n,int k,int temp[],int q=0,int r=0)
{
if(q==k)
{
return;
}
else
{
for(int i=r;i<n;i++)
{
temp[q]=array[i];
sum_index[q]+=temp[q]; //or sum_index[q]+=s[i];
sub_set(array,n,k,temp,q+1,i+1);
}
}
}
It still takes too much time.
Is there any other approach to this problem, or a modification I am unaware of?

Instead of iterating through the possible subsets, think of it as a combinatorics problem.
To use your example of k=2 and {1,2,3}, let's just look at the first value of the result. It is made of two 1's and one 2. The two 1's correspond to the number of one-element sets that can be made from {2, 3}, and the one 2 corresponds to the number of one-element sets that can be made from {3}. A similar arrangement exists for the one 2 and two 3's in the second value of the result, this time looking at the subsets of the elements that appear before the element being considered.
Things get a bit more complicated when k>2 because then you will have to look for the number of combinations of elements before and after the element being considered, but the basic premise still works. Multiply the number of possible subsets before times the number of subsets afterwards and that will tell you how many times each element contributes to the result.
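The counting argument above can be sketched as follows. This is only an illustration under stated assumptions: the names `binomial_table` and `column_sums` are made up for this sketch, positions are taken in the original array order (no sorting), and the counts are assumed to fit into 64 bits. The element at index i lands in column c exactly C(i, c) * C(n-1-i, k-1-c) times: choose c elements before it and k-1-c elements after it.

```cpp
#include <cstddef>
#include <vector>

// Pascal's triangle: c[i][j] = (i choose j) for 0 <= j <= i <= n.
// Hypothetical helper for this sketch; values are assumed to fit in 64 bits.
std::vector<std::vector<unsigned long long>> binomial_table(std::size_t n) {
    std::vector<std::vector<unsigned long long>> c(
        n + 1, std::vector<unsigned long long>(n + 1, 0));
    for (std::size_t i = 0; i <= n; ++i) {
        c[i][0] = 1;
        for (std::size_t j = 1; j <= i; ++j)
            c[i][j] = c[i - 1][j - 1] + c[i - 1][j];
    }
    return c;
}

// Sum of the values in each of the k columns over all k-element subsets,
// where a subset keeps its elements in array order.
std::vector<long long> column_sums(const std::vector<int>& a, std::size_t k) {
    const std::size_t n = a.size();
    std::vector<long long> sums(k, 0);
    if (k == 0 || k > n) return sums;
    const auto c = binomial_table(n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t col = 0; col < k && col <= i; ++col) {
            const std::size_t after = n - 1 - i;        // elements to the right of i
            const std::size_t need_after = k - 1 - col; // columns still to fill
            if (need_after > after) continue;           // a[i] cannot sit in this column
            // subsets before (i choose col) times subsets after
            const unsigned long long times = c[i][col] * c[after][need_after];
            sums[col] += static_cast<long long>(times) * a[i];
        }
    }
    return sums;
}
```

For the question's example, `column_sums({1, 2, 3}, 2)` yields 4 and 8. After the O(n^2) table is built, the main loop does O(n*k) work.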

A solution in O(n^2) instead of O(n!):
First a tiny (:)) bit of explanation, then some code:
I'm going to assume here that your array is sorted (if not, use std::sort first). Additionally, I'm going to work with the array values 1,2,3,4... here; if your array consists of arbitrary values (like 2 8 17), you'll have to think of them as the indices (i.e. 1=>2, 2=>8 etc.)
Definition: (x choose y) means the binomial coefficient, x! / (y! * (x-y)!). If you have an array of size a and some k for the subset size, (a choose k) is the number of combinations, e.g. 3 for your example: (1,2), (1,3) and (2,3).
You want the sum for each column if you write the combinations under each other. This would be easy if you knew for each column how many times each array element occurs, i.e. how many 1's, 2's and 3's for the first column, and how many for the second column (with k=2).
Here is a bigger example to explain: (1,2,3,4,5) and all possible k's (each in one block):
1
2
3
4
5
12
13
14
15
23
24
25
34
35
45
123
124
125
134
135
145
234
235
245
345
... (didn't write k=4)
12345
Let's introduce column indices, 0<=c<k, i.e. c=0 means the first column, c=1 the second and so on; and the array size s=5.
So, looking e.g. at the k=3 block, you'll notice that the lines beginning with 1 (column c=0) contain all combinations of the values (2,3,4,5) for k=2. More generally, a value x in column c has all combinations of the values x+1 to s after it. The values from x+1 to s are s-x different values, and after column c there are k-c-1 more columns. So, for a value x, you can calculate ((s-x) choose (k-c-1)).
Additionally, the first column has only the values 1,2,3; the last two numbers do not appear here because after this column there are two more columns.
If you do this for the first column, it works well. Eg. with value 1 in the first column of k=3 above:
count(x) = ((s-x) choose (k-c-1)) = (4 choose 2) = 6
and indeed there are six 1's there. Calculate this count for every array value, multiply x*count(x), and sum it up for every x; that's the result for the first column.
The other columns are a tiny bit harder, because there can be multiple "combination blocks" of the same number. To handle that, the step above needs a small adjustment: you need a multiplier for each array value, and for the first column each multiplier is 1. In the calculation x*count(x) above, take x*count(x)*multiplier(x) instead.
In the k=3 example, 1 in the first column can be followed by 2,3,4, 2 can be followed by 3,4, and 3 by 4. So the 3-based blocks of the second column need to be counted twice, and the 4-based ones even three times. More generally, the multiplier of a value x in column c is the number of ways to fill the c previous columns with smaller values, which is ((x-1) choose c).
...
Some code:
#include<iostream>
#include<vector>
#include<algorithm>
using namespace std;
// factorial (x!)
unsigned long long fact(unsigned char x)
{
unsigned long long res = 1;
while(x)
{
res *= x;
x--;
}
return res;
}
//binomial coefficient (n choose k)
unsigned long long binom(unsigned char n, unsigned char k)
{
if(!n || !k) return 1;
return (fact(n) / fact(k)) / fact(n-k);
}
//just for convenience
template<class T> void printvector(std::vector<T> data)
{
for(auto l : data) cout << l << " ";
cout << endl;
}
std::vector<unsigned long long> calculate(std::vector<int> data, int k)
{
if(k < 1 || k > 255 || data.size() < static_cast<std::size_t>(k)) return {}; //invalid stuff
std::vector<unsigned long long> res(k, 0); //result data
std::sort(data.begin(), data.end()); //as described
for(int column = 0; column < k; column++) //each column separately
{
//for each array element that can appear in this column
for(std::size_t x = column; x <= (data.size() + column - k); x++)
{
//number of ways to fill the earlier columns with smaller elements
unsigned long long multiplier = binom(x, column);
//core calculation
res[column] += data[x] * multiplier * binom(data.size() - x - 1, k - column - 1);
}
}
return res;
}
int main() {
printvector(calculate({1,2,3}, 2)); //output 4 8
return 0;
}

std::next_permutation may help:
std::vector<int> sub_set(const std::vector<int>& a, int k)
{
std::vector<int> res(k, 0);
std::vector<bool> p(a.size() - k, false);
p.resize(a.size(), true);
do
{
int index = 0;
for (std::size_t i = 0; i != p.size(); ++i) {
if (p[i]) {
res[index++] += a[i];
}
}
} while (std::next_permutation(p.begin(), p.end()));
return res;
}

Biggest valued, 'H' shaped area in matrix?

I got a task where I must find the H-shaped region which has the biggest sum of numbers in it. By an 'H'-shaped region the task means the following fixed shape, consisting of 7 elements:
x x
xxx
x x
The matrix must be 3x3 or bigger, and I don't have to work with rotated 'H' shapes. However, the shape can move upwards and downwards if the matrix is big enough (for example a 4x6 matrix).
I thought of first counting a "maximum" value, starting from the [0][0] element. However, I can't figure out how to move this region-counting along. Could you help me out, please?
Here's my code so far:
#include<iostream>
int main(){
const int n = 3;
const int m = 4;
int mtx[n][m] = {
1,1,1,3,
1,1,1,3,
1,1,1,3
};
//counting the maximum H value
int max = 0;
for(int i = 0; i < n; i++){
max += mtx[i][0];
}
for(int i = 0; i < n; i++){
max += mtx[i][2];
}
max += mtx[1][1];
int counter = 0;
int j = 0;
int k = 0;
//finding if there is bigger
while(counter >max){
//questioned area, not sure what to do here
if(counter < max){
max = counter;
}
}
return 0;
}
As mentioned in a comment, knowing the single maximum element in the matrix does not help to find the maximum H shape. A concrete counter example:
0 1 0 5
9 1 1 5
0 1 0 5
Maximum element is 9 but maximum sum H shape is
1 0 5
1 1 5
1 0 5
You would need to add more information to know whether the maximum element is part of the maximum H: only if max_element + 6*min_element > 7*second_largest_element can you be sure that max_element is part of the biggest-sum H. This condition could be refined, but it cannot be made such that the biggest element is always part of the biggest-sum H, because that's not true in general (see the counter example above).
As suggested in another comment, you should write a function that given coordinates of the upper left corner calculates the sum of elements in the H shape:
#include <iostream>
#include <array>
int H_sum(const std::array<std::array<int,4>,3>& matrix, int x0,int y0){
// A E
// BDF
// C G
int sum = matrix[x0][y0]; // A
sum += matrix[x0+1][y0]; // B
sum += matrix[x0+2][y0]; // C
sum += matrix[x0+1][y0+1]; // D
sum += matrix[x0][y0+2]; // E
sum += matrix[x0+1][y0+2]; // F
sum += matrix[x0+2][y0+2]; // G
return sum;
}
int main() {
std::array<std::array<int,4>,3> mtx{
1,1,1,3,
1,1,1,3,
1,1,1,3
};
for (const auto& row : mtx){
for (const auto& e : row){
std::cout << e;
}
std::cout << "\n";
}
std::cout << H_sum(mtx,0,0);
}
This is of course only something to get you started. Next you have to carefully consider what are the maximum indices you can pass to H_sum without going out-of-bounds. Then write a nested loop to scan all positions of the H and remember the maximum value encountered.
Last but not least, what I described so far is a brute force approach. You calculate sum for all possible H shapes, remember the maximum, and are done. Maybe there is a clever trick to avoid adding all elements multiple times (for example in a larger matrix, the right leg of one H is the left leg of a different H). Though before applying such tricks and trying to be clever I strongly suggest to write something that is perhaps slow but correct, easy to read and verify.
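The brute-force scan described above could look roughly like this. It is a minimal sketch, not a definitive implementation: the function name `max_h_sum`, the use of `std::vector` instead of the fixed-size `std::array`, and the sentinel returned when no H fits are all assumptions made for illustration, and the matrix is assumed to be rectangular.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Largest sum over all upright 3x3 'H' shapes in a rectangular matrix.
// Returns the smallest long long if the matrix is too small to hold an H
// (a convention chosen for this sketch only).
long long max_h_sum(const std::vector<std::vector<int>>& m) {
    long long best = std::numeric_limits<long long>::min();
    const std::size_t rows = m.size();
    const std::size_t cols = rows == 0 ? 0 : m[0].size();
    // (r, c) is the upper left corner; r + 2 and c + 2 must stay in bounds.
    for (std::size_t r = 0; r + 3 <= rows; ++r) {
        for (std::size_t c = 0; c + 3 <= cols; ++c) {
            long long sum = 0;
            sum += m[r][c];          // A
            sum += m[r + 1][c];      // B
            sum += m[r + 2][c];      // C
            sum += m[r + 1][c + 1];  // D
            sum += m[r][c + 2];      // E
            sum += m[r + 1][c + 2];  // F
            sum += m[r + 2][c + 2];  // G
            if (sum > best) best = sum;
        }
    }
    return best;
}
```

For the 3x4 matrix from the question this gives 13 (the H whose right leg is the column of 3's), and for the counter example above it gives 19.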

O(n^2) algorithm to find largest 3 integer arithmetic series

The problem is fairly simple. Given an input of N (3 <= N <= 3000) integers, find the largest sum of a 3-integer arithmetic series in the sequence. E.g. (15, 8, 1) is a larger arithmetic series than (12, 7, 2) because 15 + 8 + 1 > 12 + 7 + 2. The integers that are part of the largest arithmetic series do NOT have to be adjacent, and the order they appear in is irrelevant.
An example input would be:
6
1 6 11 2 7 12
where the first number is N (in this case, 6) and the second line is the sequence N integers long.
And the output would be the largest sum of any 3-integer arithmetic series. Like so:
21
because 2, 7 and 12 has the largest sum of any 3-integer arithmetic series in the sequence, and 2 + 7 + 12 = 21. It is also guaranteed that a 3-integer arithmetic series exists in the sequence.
EDIT: The numbers that make up the sum (output) have to be an arithmetic series (constant difference) that is 3 integers long. In the case of the sample input, (1 6 11) is a possible arithmetic series, but it is smaller than (2 7 12) because 2 + 7 + 12 > 1 + 6 + 11. Thus 21 would be outputted because it is larger.
Here is my attempt at solving this question in C++:
#include <bits/stdc++.h>
using namespace std;
vector<int> results;
vector<int> middle;
vector<int> diff;
int main(){
int n;
cin >> n;
int sizes[n];
for (int i = 0; i < n; i++){
int size;
cin >> size;
sizes[i] = size;
}
sort(sizes, sizes + n, greater<int>());
for (int i = 0; i < n; i++){
for (int j = i+1; j < n; j++){
int difference = sizes[i] - sizes[j];
diff.insert(diff.end(), difference);
middle.insert(middle.end(), sizes[j]);
}
}
for (size_t i = 0; i < middle.size(); i++){
int difference = middle[i] - diff[i];
for (int j = 0; j < n; j++){
if (sizes[j] == difference) results.insert(results.end(), middle[i]);
}
}
int max = 0;
for (size_t i = 0; i < results.size(); i++) {
if (results[i] > max) max = results[i];
}
int answer = max * 3;
cout << answer;
return 0;
}
My approach was to record the middle number and the difference in separate vectors, then loop through the vectors and check whether the middle number minus the difference is in the array, in which case the middle number gets added to another vector. Then the largest middle number is found and multiplied by 3 to get the sum. This approach made my algorithm go from O(n^3) to roughly O(n^2). However, the algorithm doesn't always produce the correct output (and I can't come up with a test case that shows why), and since I'm using separate vectors, I get a std::bad_alloc error for large N values because I am probably using too much memory. The time limit in this question is 1.4 sec per test case, and the memory limit is 64 MB.
Since N can only be max 3000, O(n^2) is sufficient. So what is an optimal O(n^2) solution (or better) to this problem?
So, a simple solution for this problem is to put all elements into an std::map to count their frequencies, then iterate over the first and second element in the arithmetic progression, then search the map for the third.
Iterating over all pairs takes O(n^2), and each map lookup with find() takes O(log n), so the total is O(n^2 log n).
#include <algorithm>
#include <climits>
#include <iostream>
#include <map>
using namespace std;
const int maxn = 3000;
int a[maxn+1];
map<int, int> freq;
int main()
{
int n; cin >> n;
for (int i = 1; i <= n; i++) {cin >> a[i]; freq[a[i]]++;} // inserting frequencies
int maxi = INT_MIN;
for (int i = 1; i <= n-1; i++)
{
for (int j = i+1; j <= n; j++)
{
int first = a[i], sec = a[j]; if (first > sec) {swap(first, sec);} //ensure that first is not larger than sec
int gap = sec - first; //calculating difference
if (gap == 0 && freq[first] >= 3) {maxi = max(maxi, first*3); } //if first == sec, three equal values form a series with difference 0
else
{
int third1 = first - gap; //third element below first; a third element above sec is found when that pair is visited
if (freq.find(third1) != freq.end() && gap != 0) {maxi = max(maxi, first + sec + third1); } //finding third element
}
}
}
cout << maxi;
}
Output : 21
Another test :
6
3 4 5 7 7 7
Output : 21
Another test :
5
10 10 9 8 7
Output : 27
You can try std::unordered_map to try and reduce the complexity even more.
Also see Why is "using namespace std;" considered bad practice?
The sum of a 3-element arithmetic progression is 3-times the middle element, so I would search around a middle element, and would start the search from the "upper" end of the "array" (and have it sorted). This way the first hit is the largest one. Also, the actual array would be a frequency-map, so elements are unique, but still track if any element has 3 copies, because that can become a hit (progression by 0).
I think it may be better to create the frequency-map first, and sort it later, simply because it may result in sorting fewer elements - though they are going to be pairs of value and count in this case.
function max3(arr){
let stats=new Map();
for(let value of arr)
stats.set(value,(stats.get(value) || 0)+1);
let array=Array.from(stats); // array of [value,count] arrays
array.sort((x,y)=>y[0]-x[0]); // sort by value, descending
for(let i=0;i<array.length;i++){
let [value,count]=array[i];
if(count>=3)
return 3*value;
for(let j=0;j<i;j++)
if(stats.has(2*value-array[j][0]))
return 3*value;
}
}
console.log(max3([1,6,11,2,7,12])); // original example
console.log(max3([3,4,5,7,7,7])); // an example of 3 identical elements
console.log(max3([10,10,9,8,7])); // an example from another answer
console.log(max3([1,2,11,6,7,12])); // example with non-adjacent elements
console.log(max3([3,7,1,1,1])); // check for finding lowest possible triplet too
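Since the rest of this thread is in C++, here is a rough C++ rendering of the same middle-element idea. It is a sketch under assumptions: the name `max3` is borrowed from the JavaScript above, and returning the smallest `long long` when no series exists is a convention invented here (the problem statement guarantees that one exists).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <limits>
#include <unordered_map>
#include <vector>

// Largest sum of a 3-element arithmetic series, found via its middle element.
long long max3(const std::vector<int>& values) {
    std::unordered_map<int, int> freq;  // value -> number of occurrences
    for (int v : values) ++freq[v];

    std::vector<int> distinct;
    distinct.reserve(freq.size());
    for (const auto& entry : freq) distinct.push_back(entry.first);
    std::sort(distinct.begin(), distinct.end(), std::greater<int>());  // descending

    for (std::size_t i = 0; i < distinct.size(); ++i) {
        const long long mid = distinct[i];
        if (freq[distinct[i]] >= 3) return 3 * mid;  // series with difference 0
        for (std::size_t j = 0; j < i; ++j) {        // distinct[j] > mid is the high end
            const long long low = 2 * mid - distinct[j];
            if (low < std::numeric_limits<int>::min()) continue;  // cannot be an int key
            if (freq.find(static_cast<int>(low)) != freq.end()) return 3 * mid;
        }
    }
    return std::numeric_limits<long long>::min();  // no series found
}
```

Because the distinct values are visited in descending order, the first middle element that works gives the largest sum, so the function can return immediately.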

Difficulty understanding someone's source code for a solution to an IOI problem

Here is the link to the problem: https://ioi2019.az/source/Tasks/Day1/Shoes/NGA.pdf
Here is a brief explanation about the problem statement:
You are given an integer n in the range 1≤n≤1e5, which is the number of positive integers in the array as well as the number of negative integers in the array (so the total size of the array is 2n).
The problem wants you to find the minimum number of swaps of adjacent elements needed so that each negative number and the positive number with the same absolute value are next to each other, with -x immediately to the left of x.
Example:
n = 2;
the array inputed = {2, 1, -1, -2}
The minimum number of operations will be four:
2,1,-1,-2: 0 swaps
2,-1,1,-2: 1 swap(swapping 1 and -1)
2,-1,-2,1: 2 swaps (swapping 1 and -2)
2,-2,-1,1: 3 swaps (swapping -1 and -2)
-2,2,-1,1: 4 swaps (swapping 2 and -2)
The final answer will be four.
Another example:
the array inputed = {-2, 2, 2, -2, -2, 2}
The minimum swaps is one. Because we can just swap elements at index 2 and 3.
Final array: {-2,2,-2,2,-2,2}
When doing this question I got a wrong answer, so I decided to look at someone's source code on GitHub.
Here is the source code:
#include "shoes.h"
#include <bits/stdc++.h>
#define sz(v) ((int)(v).size())
using namespace std;
using lint = long long;
using pi = pair<int, int>;
const int MAXN = 200005;
struct bit{
int tree[MAXN];
void add(int x, int v){
for(int i=x; i<MAXN; i+=i&-i) tree[i] += v;
}
int query(int x){
int ret = 0;
for(int i=x; i; i-=i&-i) ret += tree[i];
return ret;
}
}bit;
lint count_swaps(vector<int> s) {
int n = sz(s) / 2;
lint ret = 0;
vector<pi> v;
vector<pi> ord[MAXN];
for(int i=0; i<sz(s); i++){
ord[abs(s[i])].emplace_back(s[i], i);
}
for(int i=1; i<=n; i++){
sort(ord[i].begin(), ord[i].end());
for(int j=0; j<sz(ord[i])/2; j++){
int l = ord[i][j].second;
int r = ord[i][j + sz(ord[i])/2].second; //confusion starts here all the way to the bottom
if(l > r){
swap(l, r);
ret++;
}
v.emplace_back(l + 1, r + 1);
}
}
for(int i=1; i<=2*n; i++) bit.add(i, 1);
sort(v.begin(), v.end());
for(auto &i : v){
ret += bit.query(i.second - 1) - bit.query(i.first);
bit.add(i.first, -1);
bit.add(i.second, -1);
}
return ret;
}
However, I don't think I understand this code too well.
I understand what the functions add and query in the BIT do. I'm just confused from the line I commented in the code all the way to the bottom. I don't understand what it does or what its purpose is.
Can someone walk through what this code is doing? Or give any suggestions on how I should properly and efficiently approach this problem (maybe even your own solutions)? Thank you.
int r = ord[i][j + sz(ord[i])/2].second;
We've sorted the tuples of one shoe size in a vector of <size, idx>, which means all the negatives of this size take up the first half of ord[i], and all the positives are in the second half.
if (l > r){
swap(l, r);
ret++;
}
After our sort on size, the indexes of each corresponding pair may not be ordered with the negative before the positive. Each one of those costs a swap.
v.emplace_back(l + 1, r + 1);
Insert into v our interval for the corresponding pair of shoes of size i.
for(int i=1; i<=2*n; i++) bit.add(i, 1);
sort(v.begin(), v.end());
Add the value 1 to the binary indexed tree (Fenwick tree) for each index location of a shoe. Sort the shoe intervals.
for(auto &i : v){
ret += bit.query(i.second - 1) - bit.query(i.first);
For each pair of shoes in v, the number of swaps needed is the number of shoes left in between them, expressed in the sum of the segment.
bit.add(i.first, -1);
bit.add(i.second, -1);
Remove the pair of shoes from the tree so a new segment sum won't include them. We can do this since the shoe intervals are processed left to right, which means no "inner" pair of shoes gets processed before an outer pair.
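To try the walkthrough on small inputs without the grader's "shoes.h", the same steps can be condensed into a self-contained sketch. This is a restatement for illustration, not the original author's code: `Fenwick` and `count_swaps_sketch` are names made up here, and the input is assumed to be valid (every size has as many left shoes as right shoes).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <utility>
#include <vector>

// Minimal binary indexed tree over positions 1..n (hypothetical helper).
struct Fenwick {
    std::vector<int> tree;
    explicit Fenwick(int n) : tree(n + 1, 0) {}
    void add(int pos, int delta) {
        for (; pos < static_cast<int>(tree.size()); pos += pos & -pos) tree[pos] += delta;
    }
    int prefix(int pos) const {  // sum of positions 1..pos
        int sum = 0;
        for (; pos > 0; pos -= pos & -pos) sum += tree[pos];
        return sum;
    }
};

long long count_swaps_sketch(const std::vector<int>& s) {
    const int m = static_cast<int>(s.size());
    std::map<int, std::vector<int>> left, right;  // size -> positions, ascending
    for (int i = 0; i < m; ++i) (s[i] < 0 ? left : right)[std::abs(s[i])].push_back(i);

    long long swaps = 0;
    std::vector<std::pair<int, int>> intervals;  // 1-based (first, second) per pair
    for (const auto& entry : left) {
        const std::vector<int>& lpos = entry.second;
        const std::vector<int>& rpos = right[entry.first];
        // Match the j-th left shoe of a size with the j-th right shoe of that size.
        for (std::size_t j = 0; j < lpos.size() && j < rpos.size(); ++j) {
            int l = lpos[j], r = rpos[j];
            if (l > r) { std::swap(l, r); ++swaps; }  // right shoe comes first: one extra swap
            intervals.emplace_back(l + 1, r + 1);
        }
    }

    Fenwick bit(m);
    for (int i = 1; i <= m; ++i) bit.add(i, 1);  // every shoe is still present
    std::sort(intervals.begin(), intervals.end());
    for (const auto& iv : intervals) {
        swaps += bit.prefix(iv.second - 1) - bit.prefix(iv.first);  // shoes still in between
        bit.add(iv.first, -1);   // remove the finished pair
        bit.add(iv.second, -1);
    }
    return swaps;
}
```

On the two examples from the question, `{2, 1, -1, -2}` and `{-2, 2, 2, -2, -2, 2}`, this returns 4 and 1.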

Find the length of longest consecutive sub-sequence of a sequence in O(n) time where all elements are less than 10^6

I have to find the length of the longest increasing sub-sequence of an array such that the difference between any two consecutive elements of the sub-sequence is 1.
For example: {5,4,2,1,6,2,3,4,5}
length of largest consecutive increasing sub-sequence : 5 {1,2,3,4,5}
So far I have tried this:
#include <iostream>
using namespace std;
int a[1000001];
int m[1000001]={0};
int main()
{
int n;
cin>>n;
for(int i=1;i<=n;i++)
{
cin>>a[i];
m[a[i]]=i;
}
int maxm=0;
for(int i=1;i<=n;i++)
{
if(m[a[i]-1]==0 || m[a[i]]<=m[a[i]-1])
{
int k=a[i];
int prev = m[k];
k++;
int c=1;
while(m[k]>prev)
{
c++;
prev=m[k];
k++;
}
maxm=max(maxm,c);
}
}
cout<<maxm;
return 0;
}
But this gives a wrong answer for cases like {2,2,1,2,3,1,2,3,4,3,5}
Any help would be appreciated.
Let's discuss the algorithm here rather than jumping to the answer/code.
Associate a value with each element X: the length of the longest consecutive run ..., X-2, X-1, X that ends at X among the elements seen so far. It is the value stored for X-1 plus 1, because we have now encountered X as well.
Since every element satisfies 1 <= A[i] <= 10^6, we are in luck.
We make an array with one slot for each possible value, whether it appears in the input or not. This kind of approach is similar to a hash table,
but since all our elements are integers, we are using an array as a simple hash table where the key is the index of the array and the value is hash_table[index], i.e. the value stored at that index.
Now lets dry run our approach for one of our sample inputs :
5 1 5 6 2 3 8 7 4
Initially the hash table looks like this:
hash_table = {0,0,0,0,0,0,0,0,0}; // Not showing indices > 8 because they won't be affected.
Now we encounter 5 :
We look up the value of hash_table[4] and add 1 and put it as the value of 5 i.e. hash_table[5] = hash_table[4] + 1
So hash table looks like this now :
hash_table = {0,0,0,0,0,1,0,0,0};
Then we encounter 1 : we do the same thing :
hash_table = {0,1,0,0,0,1,0,0,0};
Like that after taking in all the numbers hash_table looks like this :
hash_table = {0,1,2,3,4,1,2,3,1}
Our answer is the maximum value of the hash_table, which is 4.
Talk is cheap show me the code :
#include <stdio.h>
#define MAX (int)1e6
int h[MAX + 1]; /* index MAX itself must be valid, since elements can be as large as 10^6 */
int main ()
{
int N,i,max=0,temp;
scanf ("%d",&N);
for (i=0;i<N;i++)
{
scanf ("%d",&temp);
h[temp] = h[temp - 1] + 1;
if (h[temp] > max)
max = h[temp];
}
printf ("%d\n",max);
return 0;
}
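The same recurrence can be wrapped in a function so it is easy to try on the inputs from this thread. A small C++ sketch, assuming the values lie in 1..max_value; the name `longest_consecutive_run`, the `max_value` parameter and the choice to skip out-of-range values are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Length of the longest sub-sequence x, x+1, x+2, ... (in input order).
// best_ending_at[v] is the longest such run seen so far that ends in v.
int longest_consecutive_run(const std::vector<int>& values, int max_value = 1000000) {
    std::vector<int> best_ending_at(static_cast<std::size_t>(max_value) + 1, 0);
    int longest = 0;
    for (int v : values) {
        if (v < 1 || v > max_value) continue;  // outside the assumed range, skipped here
        best_ending_at[v] = best_ending_at[v - 1] + 1;
        longest = std::max(longest, best_ending_at[v]);
    }
    return longest;
}
```

For `{5,4,2,1,6,2,3,4,5}` and for the failing case from the question, `{2,2,1,2,3,1,2,3,4,3,5}`, this returns 5; for the dry-run input `5 1 5 6 2 3 8 7 4` it returns 4.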
So what if you can't upvote. You can still accept this answer if you found it useful !
You are thinking a bit too complicated. If the elements of the sequence have to be adjacent in the array, you just have to iterate through the array once, count the length of each run, and remember the longest one:
int main() {
int size;
int input[100000];
/* ... get your input with size elements ... */
int current = 1;
int biggest = 1;
for (int i=1;i<size;i++) {
if (input[i] == input[i-1] + 1) { current++; }
else {
if (current > biggest) { biggest = current; }
current = 1;
}
}
if (current > biggest) { biggest = current; } // the last run may be the longest
}

How to reduce complexity of this code

Can anyone suggest a better algorithm for this problem than trying all the combinations?
Given an array A of N numbers, find the number of distinct pairs (i,
j) such that j >=i and A[i] = A[j].
First line of the input contains number of test cases T. Each test
case has two lines, first line is the number N, followed by a line
consisting of N integers which are the elements of array A.
For each test case print the number of distinct pairs.
Constraints:
1 <= T <= 10
1 <= N <= 10^6
-10^6 <= A[i] <= 10^6 for 0 <= i < N
My idea is to sort the array, find the frequency of every distinct integer, add nC2 for each frequency, and finally add the length of the array. Unfortunately it gives a wrong answer for some cases that are not known to me. Here is the implementation.
code:
#include <iostream>
#include<cstdio>
#include<algorithm>
using namespace std;
long fun(long a) //to find the aC2 for given a
{
if (a == 1) return 0;
return (a * (a - 1)) / 2;
}
int main()
{
long t, i, j, n, tmp = 0;
long long count;
long ar[1000000];
cin >> t;
while (t--)
{
cin >> n;
for (i = 0; i < n; i++)
{
cin >> ar[i];
}
count = 0;
sort(ar, ar + n);
for (i = 0; i < n - 1; i++)
{
if (ar[i] == ar[i + 1])
{
tmp++;
}
else
{
count += fun(tmp + 1);
tmp = 0;
}
}
if (tmp != 0)
{
count += fun(tmp + 1);
}
cout << count + n << "\n";
}
return 0;
}
Keep a count of how many times each number appears in an array. Then iterate over the result array and add the triangular number for each.
For example(from the source test case):
Input:
3
1 2 1
count array = {0, 2, 1} // no zeroes, two ones, one two
pairs = triangle(0) + triangle(2) + triangle(1)
pairs = 0 + 3 + 1
pairs = 4
Triangle numbers can be computed by (n * n + n) / 2, and the whole thing is O(n).
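A minimal sketch of this counting approach, assuming all values lie in [-limit, limit]; the function name `count_pairs` and the `limit` parameter are invented for this example. Values are shifted by `limit` so that negative numbers can index the frequency array.

```cpp
#include <cstddef>
#include <vector>

// Number of pairs (i, j) with j >= i and a[i] == a[j].
// Assumes every value lies in [-limit, limit].
long long count_pairs(const std::vector<int>& a, int limit = 1000000) {
    std::vector<int> freq(2 * static_cast<std::size_t>(limit) + 1, 0);
    for (int v : a) ++freq[static_cast<std::size_t>(v + limit)];  // shift to 0..2*limit
    long long pairs = 0;
    for (long long f : freq) pairs += f * (f + 1) / 2;  // triangular number of f
    return pairs;
}
```

For the sample input `1 2 1` this gives 3 + 1 = 4. The triangular number is computed in `long long`, because a frequency of 10^6 already produces a result of about 5 * 10^11.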
Edit:
First, there's no need to sort if you're counting frequency. I see what you did with sorting, but if you just keep a separate array of frequencies, it's easier. It takes more space, but since the array length is at most 10^6 and the elements lie between -10^6 and 10^6, an int array of 2*10^6 + 1 entries (indexed by value + 10^6) is enough. This easily fits in the 256MB space requirements given in the challenge.
For the n choose 2 part: since each element can also be paired with itself, a value that occurs f times contributes (f+1 choose 2) = f*(f+1)/2 pairs. The difference between (f choose 2) and (f+1 choose 2) is not one, but f, so summed over all values this comes to the same total as adding N once at the end.