How to reduce the complexity of this code - C++

Can anyone provide a better algorithm than trying all combinations for this problem?
Given an array A of N numbers, find the number of distinct pairs (i, j) such that j >= i and A[i] = A[j].
First line of the input contains number of test cases T. Each test
case has two lines, first line is the number N, followed by a line
consisting of N integers which are the elements of array A.
For each test case print the number of distinct pairs.
Constraints:
1 <= T <= 10
1 <= N <= 10^6
-10^6 <= A[i] <= 10^6 for 0 <= i < N
My idea was to first sort the array, then find the frequency of every distinct integer, then add nC2 over all the frequencies, and finally add the length of the array. Unfortunately it gives a wrong answer for some cases which are not known to me. Here is the implementation:
#include <iostream>
#include <cstdio>
#include <algorithm>
using namespace std;

long fun(long a) // compute aC2 for a given a
{
    if (a == 1) return 0;
    return (a * (a - 1)) / 2;
}

int main()
{
    long t, i, j, n, tmp = 0;
    long long count;
    long ar[1000000];
    cin >> t;
    while (t--)
    {
        cin >> n;
        for (i = 0; i < n; i++)
        {
            cin >> ar[i];
        }
        count = 0;
        sort(ar, ar + n);
        for (i = 0; i < n - 1; i++)
        {
            if (ar[i] == ar[i + 1])
            {
                tmp++;
            }
            else
            {
                count += fun(tmp + 1);
                tmp = 0;
            }
        }
        if (tmp != 0)
        {
            count += fun(tmp + 1);
        }
        cout << count + n << "\n";
    }
    return 0;
}

Keep a count of how many times each number appears in the array. Then iterate over the count array and add the triangular number of each count.
For example (from the source test case):
Input:
3
1 2 1
count array = {0, 2, 1} // no zeroes, two ones, one two
pairs = triangle(0) + triangle(2) + triangle(1)
pairs = 0 + 3 + 1
pairs = 4
Triangle numbers can be computed by (n * n + n) / 2, and the whole thing is O(n).
Edit:
First, there's no need to sort if you're counting frequency. I see what you did with sorting, but if you just keep a separate array of frequencies, it's easier. It takes more space, but since the elements and the array length are both constrained to < 10^6, the most you'll need is an int[10^6]. This easily fits in the 256 MB memory limit given in the challenge. (Whoops: since elements can go negative, you'll need an array twice that size. Still well under the limit, though.)
For the n choose 2 part, the part you had wrong is that it's an n+1 choose 2 problem. Since each element can also pair with itself, you have to add one to n. I know you were adding n at the end, but it's not the same: the difference between tri(n) and tri(n+1) is not one, but n+1.
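A minimal C++ sketch of this counting approach (assuming the constraints above; negative values are shifted by 10^6 so they can index the count array):

#include <iostream>
#include <vector>

int main() {
    int t;
    std::cin >> t;
    while (t--) {
        long n;
        std::cin >> n;
        // Values lie in [-10^6, 10^6]; shift by 10^6 to index into [0, 2*10^6].
        std::vector<long long> freq(2000001, 0);
        for (long i = 0; i < n; i++) {
            long x;
            std::cin >> x;
            freq[x + 1000000]++;
        }
        long long pairs = 0;
        for (long long f : freq)
            pairs += (f * f + f) / 2;  // triangle(f) = (f+1 choose 2) pairs with j >= i
        std::cout << pairs << "\n";
    }
    return 0;
}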


Need optimization tips for a subset-sum-like problem with a big constraint

Given a number 1 <= N <= 3*10^5, count all subsets of the set {1, 2, ..., N-1} that sum up to N. This is essentially a modified version of the subset sum problem, with the modification that the target sum and the number of elements are tied to the same N, and that the set/array increases linearly by 1 up to N-1.
I think I have solved this using a DP ordered map and an inclusion/exclusion recursive algorithm, but due to the time and space complexity I can't compute more than 10000 elements.
#include <iostream>
#include <chrono>
#include <map>
#include "bigint.h"
using namespace std;

// 2D hashmap to store values from recursion; keys: i & sum; value: count
map<pair<int, int>, bigint> hmap;

bigint counter(int n, int i, int sum){
    // end case
    if(i == 0){
        if(sum == 0){
            return 1;
        }
        return 0;
    }
    // alternative end case if the sum reaches zero before iterating through all of the possible combinations
    if(sum == 0){
        return 1;
    }
    // case where the result of the recursion is already in the hashmap
    if(hmap.find(make_pair(i, sum)) != hmap.end()){
        return hmap[make_pair(i, sum)];
    }
    // only proceed with further recursion if the resulting sum wouldn't be negative
    if(sum - i < 0){
        // optimization that skips unnecessary recursive branches
        return hmap[make_pair(i, sum)] = counter(n, sum, sum);
    }
    else{
        // include the number / don't include the number
        return hmap[make_pair(i, sum)] = counter(n, i - 1, sum - i) + counter(n, i - 1, sum);
    }
}
The function is called with starting values N, N-1, and N: the number of elements, the iterator (which decrements), and the sum of the recursive branch (which decreases with every included value).
This is the code that calculates the number of subsets. For an input of 3000 it takes around ~22 seconds to output the result, which is 40 digits long. Because of the long digits I had to use the arbitrary-precision library bigint from rgroshanrg, which works fine for values less than ~10000. Testing beyond that gives me a segfault on lines 28-29, maybe due to the stored arbitrary-precision values becoming too big and conflicting in the map. I need to somehow speed this code up so it can work with values beyond 10000, but I am stumped. Any ideas, or should I switch to another algorithm and data storage?
Here is a different algorithm, described in a paper by Evangelos Georgiadis, "Computing Partition Numbers q(n)":
std::vector<BigInt> RestrictedPartitionNumbers(int n)
{
    std::vector<BigInt> q(n, 0);
    // initialize q with A010815
    for (int i = 0; ; i++)
    {
        int n0 = i * (3 * i - 1) >> 1;
        if (n0 >= q.size())
            break;
        q[n0] = 1 - 2 * (i & 1);
        int n1 = i * (3 * i + 1) >> 1;
        if (n1 < q.size())
            q[n1] = 1 - 2 * (i & 1);
    }
    // construct A000009 as per "Evangelos Georgiadis, Computing Partition Numbers q(n)"
    for (size_t k = 0; k < q.size(); k++)
    {
        size_t j = 1;
        size_t m = k + 1;
        while (m < q.size())
        {
            if ((j & 1) != 0)
                q[m] += q[k] << 1;
            else
                q[m] -= q[k] << 1;
            j++;
            m = k + j * j;
        }
    }
    return q;
}
It's not the fastest algorithm out there, and it took about half a minute on my computer for n = 300000. But you only need to run it once (since it computes all partition numbers up to some bound), and it doesn't take a lot of memory (a bit over 150 MB).
The results go up to but exclude n, and they assume that each number is allowed to be a partition of itself, e.g. the set {4} is a partition of the number 4. Your definition of the problem excludes that case, so you need to subtract 1 from the result.
Maybe there's a nicer way to express A010815; that part of the code isn't slow, I just think it looks bad.
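For illustration, a usage sketch under the assumption that BigInt supports subtraction by an int and stream output (APIs vary between arbitrary-precision libraries):

int N = 300000;
// Results exclude the bound, so ask for N + 1 entries to be able to read index N.
auto q = RestrictedPartitionNumbers(N + 1);
// q[N] counts partitions of N into distinct parts, i.e. subsets of
// {1, ..., N} summing to N; drop the one-element partition {N} itself.
std::cout << q[N] - 1 << std::endl;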

O(n^2) algorithm to find largest 3-integer arithmetic series

The problem is fairly simple. Given an input of N (3 <= N <= 3000) integers, find the largest sum of a 3-integer arithmetic series in the sequence. E.g. (15, 8, 1) is a larger arithmetic series than (12, 7, 2) because 15 + 8 + 1 > 12 + 7 + 2. The integers that make up the largest arithmetic series do NOT have to be adjacent, and the order in which they appear is irrelevant.
An example input would be:
6
1 6 11 2 7 12
where the first number is N (in this case, 6) and the second line is the sequence N integers long.
And the output would be the largest sum of any 3-integer arithmetic series. Like so:
21
because 2, 7 and 12 have the largest sum of any 3-integer arithmetic series in the sequence, and 2 + 7 + 12 = 21. It is also guaranteed that a 3-integer arithmetic series exists in the sequence.
EDIT: The numbers that make up the sum (the output) have to form an arithmetic series (constant difference) that is 3 integers long. In the case of the sample input, (1 6 11) is a possible arithmetic series, but it is smaller than (2 7 12) because 2 + 7 + 12 > 1 + 6 + 11. Thus 21 would be output because it is larger.
Here is my attempt at solving this question in C++:
#include <bits/stdc++.h>
using namespace std;

vector<int> results;
vector<int> middle;
vector<int> diff;

int main(){
    int n;
    cin >> n;
    int sizes[n];
    for (int i = 0; i < n; i++){
        int size;
        cin >> size;
        sizes[i] = size;
    }
    sort(sizes, sizes + n, greater<int>());
    for (int i = 0; i < n; i++){
        for (int j = i+1; j < n; j++){
            int difference = sizes[i] - sizes[j];
            diff.insert(diff.end(), difference);
            middle.insert(middle.end(), sizes[j]);
        }
    }
    for (size_t i = 0; i < middle.size(); i++){
        int difference = middle[i] - diff[i];
        for (int j = 0; j < n; j++){
            if (sizes[j] == difference) results.insert(results.end(), middle[i]);
        }
    }
    int max = 0;
    for (size_t i = 0; i < results.size(); i++) {
        if (results[i] > max) max = results[i];
    }
    int answer = max * 3;
    cout << answer;
    return 0;
}
My approach was to record the middle number and the difference in separate vectors, then loop through the vectors and search the array for the middle number minus the difference; when found, the middle number gets added to another vector. Then the largest middle number is found and multiplied by 3 to get the sum. This approach took my algorithm from O(n^3) to roughly O(n^2). However, the algorithm doesn't always produce the correct output (and I can't think of a test case where it fails), and since I'm using separate vectors, I get a std::bad_alloc error for large N values because I am probably using too much memory. The time limit in this question is 1.4 s per test case, and the memory limit is 64 MB.
Since N can only be max 3000, O(n^2) is sufficient. So what is an optimal O(n^2) solution (or better) to this problem?
So, a simple solution for this problem is to put all elements into a std::map to count their frequencies, then iterate over all pairs as the first and second elements of the arithmetic progression and search the map for the third.
Iterating takes O(n^2), and map lookups and find() take O(log n).
#include <iostream>
#include <map>
#include <climits>
using namespace std;

const int maxn = 3000;
int a[maxn+1];
map<int, int> freq;

int main()
{
    int n; cin >> n;
    for (int i = 1; i <= n; i++) { cin >> a[i]; freq[a[i]]++; } // insert frequencies
    int maxi = INT_MIN;
    for (int i = 1; i <= n-1; i++)
    {
        for (int j = i+1; j <= n; j++)
        {
            int first = a[i], sec = a[j]; if (first > sec) { swap(first, sec); } // ensure that first is smaller than sec
            int gap = sec - first; // calculate the difference
            if (gap == 0 && freq[first] >= 3) { maxi = max(maxi, first*3); } // if first == sec, three equal elements are needed
            else
            {
                int third1 = first - gap; // look below first; enumerating all pairs covers every progression via its two largest elements
                if (freq.find(third1) != freq.end() && gap != 0) { maxi = max(maxi, first + sec + third1); } // found the third element
            }
        }
    }
    cout << maxi;
}
Output : 21
Another test :
6
3 4 5 7 7 7
Output : 21
Another test :
5
10 10 9 8 7
Output : 27
You can try std::unordered_map to reduce the complexity even more.
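A sketch of that swap (hashing gives average O(1) lookups instead of O(log n), so the double loop becomes O(n^2) on average, though worst-case behavior can be worse):

#include <unordered_map>
// Drop-in replacement for the map declaration in the code above.
std::unordered_map<int, int> freq;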
Also see Why is "using namespace std;" considered bad practice?
The sum of a 3-element arithmetic progression is three times the middle element, so I would search around a middle element, and I would start the search from the "upper" end of the "array" (and have it sorted). This way the first hit is the largest one. Also, the actual array would be a frequency map, so elements are unique, but it still tracks whether any element has 3 copies, because that can become a hit (a progression with difference 0).
I think it may be better to create the frequency map first and sort it later, simply because it may mean sorting fewer elements, though they are going to be pairs of value and count in this case.
function max3(arr){
    let stats = new Map();
    for (let value of arr)
        stats.set(value, (stats.get(value) || 0) + 1);
    let array = Array.from(stats);     // array of [value, count] arrays
    array.sort((x, y) => y[0] - x[0]); // sort by value, descending
    for (let i = 0; i < array.length; i++){
        let [value, count] = array[i];
        if (count >= 3)
            return 3 * value;
        for (let j = 0; j < i; j++)
            if (stats.has(2 * value - array[j][0]))
                return 3 * value;
    }
}
console.log(max3([1,6,11,2,7,12])); // original example
console.log(max3([3,4,5,7,7,7])); // an example of 3 identical elements
console.log(max3([10,10,9,8,7])); // an example from another answer
console.log(max3([1,2,11,6,7,12])); // example with non-adjacent elements
console.log(max3([3,7,1,1,1])); // check for finding lowest possible triplet too

Max Sum Subarray with Partition constraint using Dynamic Programming

Problem statement: Given a set of n coins of some denominations (possibly repeating, in random order) and a number k, a game is played by a single player in the following manner: the player can pick 0 to k coins contiguously, but then has to leave the next coin unpicked. In this manner, give the highest sum of coins he/she can collect.
Input:
First line contains 2 space-separated integers n and x respectively, which denote
n - Size of the array
x - Window size
Output:
A single integer denoting the max sum the player can obtain.
Working solution link: Ideone
// arr[] and dp[] are global arrays defined in the full Ideone solution
long long solve(int n, int x) {
    if (n == 0) return 0;
    long long total = accumulate(arr + 1, arr + n + 1, 0ll);
    if (x >= n) return total;
    multiset<long long> dp_x;
    for (int i = 1; i <= x + 1; i++) {
        dp[i] = arr[i];
        dp_x.insert(dp[i]);
    }
    for (int i = x + 2; i <= n; i++) {
        dp[i] = arr[i] + *dp_x.begin();
        dp_x.erase(dp_x.find(dp[i - x - 1]));
        dp_x.insert(dp[i]);
    }
    long long ans = total;
    for (int i = n - x; i <= n; i++) {
        ans = min(ans, dp[i]);
    }
    return total - ans;
}
Can someone kindly explain how this code works, i.e., how lines 12-26 in the Ideone solution produce the correct answer?
I have dry-run the code with pen and paper and found that it gives the correct answer, but I couldn't figure out the algorithm used (if any). Can someone kindly explain how lines 12-26 produce the correct answer? Is there a technique or algorithm at work here?
I am new to DP, so if someone can point out a tutorial (YouTube video, etc.) related to this kind of problem, that would be great too. Thank you.
It looks like the idea is to convert the problem: you must leave at least one coin in every window of x+1 consecutive coins, and make the sum of the left coins minimal. The original problem's answer is then [sum of all values] - [answer of the new problem].
Now we're ready to talk about dynamic programming. Let's define a recurrence relation for f(i), which means "the partial answer of the new problem considering the 1st to i-th coins, where the i-th coin is chosen". (Sorry about the rough description; edits welcome.)
f(i) = a(i)                                        if i <= x+1
f(i) = a(i) + min(f(i-1), f(i-2), ..., f(i-x-1))   otherwise
where a(i) is the i-th coin value.
I added some comments line by line.
// NOTE: f() is dp[] and a() is arr[]
long long solve(int n, int x) {
    if (n == 0) return 0;
    long long total = accumulate(arr + 1, arr + n + 1, 0ll); // get the sum
    if (x >= n) return total;
    multiset<long long> dp_x;                 // a min-heap (with fast random access)
    for (int i = 1; i <= x + 1; i++) {        // for the 1st to (x+1)th,
        dp[i] = arr[i];                       // f(i) = a(i)
        dp_x.insert(dp[i]);                   // push the value to the heap
    }
    for (int i = x + 2; i <= n; i++) {        // for the rest,
        dp[i] = arr[i] + *dp_x.begin();       // f(i) = a(i) + min(...)
        dp_x.erase(dp_x.find(dp[i - x - 1])); // erase the oldest one from the heap
        dp_x.insert(dp[i]);                   // push the value, keeping the latest x+1 elements
    }
    long long ans = total;
    for (int i = n - x; i <= n; i++) {        // find the minimum of dp[] among candidate answers
        ans = min(ans, dp[i]);
    }
    return total - ans;
}
Please also note that multiset is used as a min-heap. However, we also need quick random access (to erase the old values), and multiset can do that in logarithmic time. So the overall time complexity is O(n log x).

Improving optimization of nested loop

I'm making a simple program to calculate the number of pairs in an array whose sum is divisible by 3; the array length and values are user-determined.
Now my code works fine. However, I just want to check if there is a faster way to calculate it that results in less running time.
When the length of the array is 10^4 or less, the program takes less than 100 ms. However, at around 10^5 it spikes up to 1000 ms. Why is this, and how can I improve the speed?
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    int N, i, b;
    b = 0;
    cin >> N;
    unsigned int j = 0;
    std::vector<unsigned int> a(N);
    for (j = 0; j < N; j++) {
        cin >> a[j];
        if (j == 0) {
        }
        else {
            for (i = j - 1; i >= 0; i = i - 1) {
                if ((a[j] + a[i]) % 3 == 0) {
                    b++;
                }
            }
        }
    }
    cout << b;
    return 0;
}
Your algorithm has O(N^2) complexity. There is a faster way.
(a[i] + a[j]) % 3 == ((a[i] % 3) + (a[j] % 3)) % 3
Thus, you need not know the exact numbers; you only need to know their remainders after division by three. A sum with zero remainder can be obtained either from two numbers that both have remainder zero (0 + 0) or from two numbers with remainders 1 and 2 (1 + 2).
The result equals r[1]*r[2] + r[0]*(r[0]-1)/2, where r[i] is the quantity of numbers with remainder equal to i.
long long r[3] = {};  // r[i] = how many values have remainder i (long long so the products below cannot overflow int)
for (int i : a) {     // a is the input vector from the question
    r[i % 3]++;
}
std::cout << r[1] * r[2] + (r[0] * (r[0] - 1)) / 2;
The complexity of this algorithm is O(N).
I've encountered this problem before, and while I can't find my particular solution, you could improve running times by hashing.
The code would look something like this:
// A C++ program to check if arr[0..n-1] can be divided
// into pairs such that every pair sum is divisible by k.
#include <bits/stdc++.h>
using namespace std;

// Returns true if arr[0..n-1] can be divided into pairs
// with sum divisible by k.
bool canPairs(int arr[], int n, int k)
{
    // An odd-length array cannot be divided into pairs
    if (n & 1)
        return false;

    // Create a frequency map to count occurrences
    // of all remainders when divided by k.
    map<int, int> freq;
    for (int i = 0; i < n; i++)
        freq[arr[i] % k]++;

    // Traverse the input array and use freq[] to decide
    // if the array can be divided into pairs
    for (int i = 0; i < n; i++)
    {
        // Remainder of the current element
        int rem = arr[i] % k;

        // If the remainder splits k into two halves,
        // there must be an even number of such remainders
        if (2*rem == k)
        {
            if (freq[rem] % 2 != 0)
                return false;
        }
        // If the remainder is 0, there must be an even
        // number of elements with remainder 0
        else if (rem == 0)
        {
            if (freq[rem] & 1)
                return false;
        }
        // Otherwise the number of occurrences of the remainder
        // must equal the number of occurrences of k - remainder
        else if (freq[rem] != freq[k - rem])
            return false;
    }
    return true;
}

/* Driver program to test the above function */
int main()
{
    int arr[] = {92, 75, 65, 48, 45, 35};
    int k = 10;
    int n = sizeof(arr)/sizeof(arr[0]);
    canPairs(arr, n, k) ? cout << "True" : cout << "False";
    return 0;
}
This works for any k (in your case, 3).
Then again, this is not my code but code you can find at the following link, with a proper explanation. I didn't just paste the link, since I think that's bad practice.

Given number N eliminate K digits to get maximum possible number

As the title says, the task is:
Given number N eliminate K digits to get maximum possible number. The digits must remain at their positions.
Example: n = 12345, k = 3, max = 45 (first three digits eliminated and digits mustn't be moved to another position).
Any idea how to solve this?
(It's not homework; I am preparing for an algorithm contest and solving problems on online judges.)
1 <= N <= 2^60, 1 <= K <= 20.
Edit: Here is my solution. It's working :)
#include <iostream>
#include <string>
#include <queue>
#include <vector>
#include <iomanip>
#include <algorithm>
#include <cmath>
using namespace std;

int main()
{
    string n;
    int k;
    cin >> n >> k;
    int b = n.size() - k - 1;
    int c = n.size() - b;
    int ind = 0;
    vector<char> res;
    char max = n.at(0);
    for (int i = 0; i < n.size() && res.size() < n.size() - k; i++) {
        max = n.at(i);
        ind = i;
        for (int j = i; j < i + c; j++) {
            if (n.at(j) > max) {
                max = n.at(j);
                ind = j;
            }
        }
        b--;
        c = n.size() - 1 - ind - b;
        res.push_back(max);
        i = ind;
    }
    for (int i = 0; i < res.size(); i++)
        cout << res.at(i);
    cout << endl;
    return 0;
}
Brute force should be fast enough for your restrictions: n will have at most 19 digits. Generate all non-negative integers with numDigits(n) bits. If exactly k bits are set, remove the digits at the positions corresponding to the set bits. Compare the result with the global optimum and update if needed.
Complexity: O(2^log n * log n). While this may seem like a lot, and asymptotically the same as O(n), it's going to be much faster in practice, because the logarithm in O(2^log n * log n) is a base-10 logarithm, which gives a much smaller value (1 + log base 10 of n is the number of digits of n).
You can avoid the log n factor by generating combinations of the n positions taken n - k at a time and building the number made up of the chosen n - k positions as you generate each combination (passing it as a parameter). This basically means you solve the similar problem: given n, pick n - k digits in order such that the resulting number is maximum.
Note: there is a method to solve this that does not involve brute force, but I wanted to show the OP this solution as well, since he asked how it could be brute forced in the comments. For the optimal method, investigate what would happen if we built our number digit by digit from left to right, and, for each digit d, we would remove all currently selected digits that are smaller than it. When can we remove them and when can't we?
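A sketch of that brute force, assuming the number is read as a digit string (uses GCC's __builtin_popcount; all names here are illustrative):

#include <iostream>
#include <string>

int main() {
    std::string n;
    int k;
    std::cin >> n >> k;
    int d = n.size();                 // at most 19 digits for n <= 2^60
    std::string best;
    // Enumerate every subset of digit positions; keep only subsets of size k.
    for (unsigned mask = 0; mask < (1u << d); mask++) {
        if (__builtin_popcount(mask) != k) continue;
        std::string candidate;
        for (int i = 0; i < d; i++)
            if (!(mask & (1u << i)))  // keep digits whose bit is not set
                candidate += n[i];
        // All candidates have length d - k, so string comparison is numeric comparison.
        if (candidate > best) best = candidate;
    }
    std::cout << best << "\n";
    return 0;
}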
Among the leftmost k+1 digits, find the largest one (say it is at the i-th location; in case of multiple occurrences, choose the leftmost). Keep it. Repeat the algorithm with k_new = k - i + 1 on the digits from i+1 to n of the original number.
E.g. k=5 and number = 7454982641
First k+1 digits: 745498
The largest digit is 9, located at i=5.
new_k=1, new number = 82641
First new_k+1 digits: 82
The largest digit is 8, located at i=1.
new_k=1, new number = 2641
First new_k+1 digits: 26
The largest digit is 6, located at i=2.
new_k=0, new number = 41
Answer: 98641
Complexity is O(n) where n is the size of the input number.
Edit: As iVlad mentioned, in the worst case the complexity can be quadratic. You can avoid that by maintaining a heap of size at most k+1, which brings the complexity to O(n log k).
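A minimal sketch of this greedy without the heap optimization (so worst case O(n*k), which is fine for k <= 20; 0-based indices):

#include <iostream>
#include <string>

int main() {
    std::string num;
    int k;
    std::cin >> num >> k;
    std::string result;
    int start = 0;                    // first digit still available
    int keep = num.size() - k;        // digits left to output
    while (keep > 0) {
        // The chosen digit must leave keep-1 digits after it,
        // so we may scan at most up to index num.size() - keep.
        int end = num.size() - keep;
        int best = start;
        for (int i = start + 1; i <= end; i++)
            if (num[i] > num[best])   // strict '>' keeps the leftmost largest
                best = i;
        result += num[best];
        start = best + 1;
        keep--;
    }
    std::cout << result << "\n";      // e.g. 7454982641 with k=5 gives 98641
    return 0;
}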
The following may help:
#include <algorithm>
#include <vector>

void removeNumb(std::vector<int>& v, int k)
{
    if (k == 0) { return; }
    if (k >= v.size()) {
        v.clear();
        return;
    }
    for (int i = 0; i != v.size() - 1; )
    {
        if (v[i] < v[i + 1]) {
            v.erase(v.begin() + i);   // drop a digit that is smaller than its successor
            if (--k == 0) { return; }
            i = std::max(i - 1, 0);   // step back: the previous digit may now be removable
        } else {
            ++i;
        }
    }
    v.resize(v.size() - k);           // if the remaining digits are non-increasing, drop from the tail
}