Given number N eliminate K digits to get maximum possible number - c++

As the title says, the task is:
Given number N eliminate K digits to get maximum possible number. The digits must remain at their positions.
Example: n = 12345, k = 3, max = 45 (first three digits eliminated and digits mustn't be moved to another position).
Any idea how to solve this?
(It's not homework, I am preparing for an algorithm contest and solve problems on online judges.)
1 <= N <= 2^60, 1 <= K <= 20.
Edit: Here is my solution. It's working :)
#include <iostream>
#include <string>
#include <queue>
#include <vector>
#include <iomanip>
#include <algorithm>
#include <cmath>
using namespace std;
int main()
{
string n;
int k;
cin >> n >> k;
int b = n.size() - k - 1;
int c = n.size() - b;
int ind = 0;
vector<char> res;
char max = n.at(0);
for (int i=0; i<n.size() && res.size() < n.size()-k; i++) {
max = n.at(i);
ind = i;
for (int j=i; j<i+c; j++) {
if (n.at(j) > max) {
max = n.at(j);
ind = j;
}
}
b--;
c = n.size() - 1 - ind - b;
res.push_back(max);
i = ind;
}
for (int i=0; i<res.size(); i++)
cout << res.at(i);
cout << endl;
return 0;
}

Brute force should be fast enough for your restrictions: n will have max 19 digits. Generate all positive integers with numDigits(n) bits. If k bits are set, then remove the digits at positions corresponding to the set bits. Compare the result with the global optimum and update if needed.
Complexity: O(2^log n * log n). While this may seem like a lot and the same thing as O(n) asymptotically, it's going to be much faster in practice, because the logarithm in O(2^log n * log n) is a base 10 logarithm, which will give a much smaller value (1 + log base 10 of n gives you the number of digits of n).
You can avoid the log n factor by generating the combinations of digit positions, taken numDigits(n) - k at a time, and building the number made up of the chosen positions as you generate each combination (pass it as a parameter). This basically means you solve the equivalent problem: given n, pick numDigits(n) - k digits in order such that the resulting number is maximum.
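A minimal sketch of the bitmask brute force described above, assuming the number is passed as a decimal string of at most about 20 digits and that 0 <= k <= number of digits. The name maxAfterRemovalBrute is a hypothetical helper, not something from the original posts.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Tries every subset of positions to delete (one bit per digit) and keeps
// the best result. Assumes digits.size() is small (about 20 at most).
std::string maxAfterRemovalBrute(const std::string& digits, int k) {
    const int len = static_cast<int>(digits.size());
    std::string best;
    for (std::uint32_t mask = 0; mask < (1u << len); ++mask) {
        // Only masks with exactly k set bits remove exactly k digits.
        if (static_cast<int>(std::bitset<32>(mask).count()) != k) continue;
        std::string kept;
        for (int pos = 0; pos < len; ++pos)
            if (!(mask & (1u << pos))) kept += digits[pos];
        // All candidates have the same length, so comparing the strings
        // lexicographically is the same as comparing the numbers.
        if (kept > best) best = kept;
    }
    return best;
}
```

For the example from the question, maxAfterRemovalBrute("12345", 3) yields "45".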
Note: there is a method to solve this that does not involve brute force, but I wanted to show the OP this solution as well, since he asked how it could be brute forced in the comments. For the optimal method, investigate what would happen if we built our number digit by digit from left to right, and, for each digit d, we would remove all currently selected digits that are smaller than it. When can we remove them and when can't we?
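One way to read the hint above is as a greedy stack: a previously kept digit can be dropped when a larger digit arrives, as long as removals are still available. The sketch below assumes the number is given as a decimal string and that k does not exceed its length; maxAfterRemoval is a hypothetical name.

```cpp
#include <string>

// Greedy, left to right: before keeping digit d, pop every kept digit that is
// smaller than d while removals remain. Each digit is pushed and popped at
// most once, so the running time is linear in the number of digits.
std::string maxAfterRemoval(const std::string& digits, int k) {
    std::string kept;      // used as a stack of kept digits
    int toRemove = k;
    for (char d : digits) {
        while (toRemove > 0 && !kept.empty() && kept.back() < d) {
            kept.pop_back();
            --toRemove;
        }
        kept.push_back(d);
    }
    // The kept digits are now non-increasing, so any removals that are left
    // are best spent on the tail. Assumes k <= digits.size().
    kept.resize(kept.size() - toRemove);
    return kept;
}
```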

In the leftmost k+1 digits, find the largest one and keep it. Say it is located at position i (1-based); if there are multiple occurrences, choose the leftmost one. Repeat the algorithm with k_new = k-i+1 on the new number made of digits i+1 to n of the original number.
Eg. k=5 and number = 7454982641
First k+1 digits: 745498
Best number is 9 and it is located at location i=5.
new_k=1, new number = 82641
First k+1 digits: 82
Best number is 8 and it is located at i=1.
new_k=1, new number = 2641
First k+1 digits: 26
Best number is 6 and it is located at i=2
new_k=0, new number = 41
Answer: 98641
Complexity is O(n) where n is the size of the input number.
Edit: As iVlad mentioned, in the worst case the complexity can be quadratic, because every window scan can cost up to k+1 steps. You can avoid that by maintaining a heap of size at most k+1, which brings the complexity down to O(n log k).
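The window procedure above could be sketched as follows. This is the plain scanning version (quadratic in the worst case), not the heap variant; it assumes the number is given as a decimal string, and maxByWindowScan is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Repeatedly keeps the leftmost largest digit among the next k+1 digits,
// then charges the skipped digits against k.
std::string maxByWindowScan(const std::string& digits, std::size_t k) {
    std::string result;
    std::size_t start = 0;
    // While start + k < size, at least one more digit has to be kept.
    while (start + k < digits.size()) {
        std::size_t best = start;
        for (std::size_t i = start + 1; i <= start + k; ++i)
            if (digits[i] > digits[best]) best = i;   // strict: leftmost maximum wins
        result.push_back(digits[best]);
        k -= best - start;      // digits skipped before the chosen one are removed
        start = best + 1;
    }
    return result;              // the last k digits, if any, are removed
}
```

With the example above, maxByWindowScan("7454982641", 5) picks 9, 8, 6, 4, 1 in that order.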

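The following may help:
#include <algorithm>
#include <vector>
// Removes k digits from v (most significant digit first) so that the
// remaining digits form the largest possible number.
void removeNumb(std::vector<int>& v, int k)
{
    if (k <= 0) { return; }
    if (static_cast<std::size_t>(k) >= v.size()) {
        v.clear();
        return;
    }
    for (std::size_t i = 0; i + 1 < v.size(); )
    {
        if (v[i] < v[i + 1]) {
            // A smaller digit in front of a larger one: dropping it helps.
            v.erase(v.begin() + i);
            if (--k == 0) { return; }
            if (i > 0) { --i; } // re-check the previous digit against its new neighbour
        } else {
            ++i;
        }
    }
    // The digits are now non-increasing; drop the remaining k from the tail.
    v.resize(v.size() - k);
}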


Need optimization tips for a subset sum like problem with a big constraint

Given a number 1 <= N <= 3*10^5, count all subsets in the set {1, 2, ..., N-1} that sum up to N. This is essentially a modified version of the subset sum problem, but with a modification that the sum and number of elements are the same, and that the set/array increases linearly by 1 to N-1.
I think I have solved this using DP with an ordered map and a recursive include/exclude algorithm, but due to the time and space complexity I can't compute more than 10000 elements.
#include <iostream>
#include <chrono>
#include <map>
#include "bigint.h"
using namespace std;
//2d hashmap to store values from recursion; keys- i & sum; value- count
map<pair<int, int>, bigint> hmap;
bigint counter(int n, int i, int sum){
//end case
if(i == 0){
if(sum == 0){
return 1;
}
return 0;
}
//alternative end case if its sum is zero before it has finished iterating through all of the possible combinations
if(sum == 0){
return 1;
}
//case if the result of the recursion is already in the hashmap
if(hmap.find(make_pair(i, sum)) != hmap.end()){
return hmap[make_pair(i, sum)];
}
//only proceed further recursion if resulting sum wouldnt be negative
if(sum - i < 0){
//optimization that skips unecessary recursive branches
return hmap[make_pair(i, sum)] = counter(n, sum, sum);
}
else{
//include the number dont include the number
return hmap[make_pair(i, sum)] = counter(n, i - 1, sum - i) + counter(n, i - 1, sum);
}
}
The function is called with the starting values N, N-1 and N: the number of elements, the iterator (which decrements) and the remaining sum of the recursive branch (which decreases with every included value).
This is the code that calculates the number of subsets. For an input of 3000 it takes about 22 seconds to output the result, which is 40 digits long. Because of the long results I had to use an arbitrary-precision library, bigint from rgroshanrg, which works fine for values below about 10000. Testing beyond that gives me a segfault on lines 28-29, maybe because the stored arbitrary-precision values become too big and conflict in the map. I need to somehow speed up this code so that it works with values beyond 10000, but I am stumped. Any ideas, or should I switch to another algorithm and data storage?
Here is a different algorithm, described in a paper by Evangelos Georgiadis, "Computing Partition Numbers q(n)":
std::vector<BigInt> RestrictedPartitionNumbers(int n)
{
std::vector<BigInt> q(n, 0);
// initialize q with A010815
for (int i = 0; ; i++)
{
int n0 = i * (3 * i - 1) >> 1;
if (n0 >= q.size())
break;
q[n0] = 1 - 2 * (i & 1);
int n1 = i * (3 * i + 1) >> 1;
if (n1 < q.size())
q[n1] = 1 - 2 * (i & 1);
}
// construct A000009 as per "Evangelos Georgiadis, Computing Partition Numbers q(n)"
for (size_t k = 0; k < q.size(); k++)
{
size_t j = 1;
size_t m = k + 1;
while (m < q.size())
{
if ((j & 1) != 0)
q[m] += q[k] << 1;
else
q[m] -= q[k] << 1;
j++;
m = k + j * j;
}
}
return q;
}
It's not the fastest algorithm out there, and it took about half a minute on my computer for n = 300000. But you only need to run it once (since it computes all partition numbers up to some bound), and it doesn't take a lot of memory (a bit over 150MB).
The results go up to but exclude n, and they assume that each number is allowed to be a partition of itself, e.g. the set {4} counts as a partition of the number 4. Your definition of the problem excludes that case, so you need to subtract 1 from the result.
Maybe there's a nicer way to express A010815. That part of the code isn't slow, though; I just think it looks bad.
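For small N, the "subtract 1" adjustment can be cross-checked with a plain knapsack-style count. This is only a sanity-check sketch, not the algorithm from the paper: countSubsets is a hypothetical helper, and because it uses 64-bit counters it overflows long before N = 3000 (where the answer already has 40 digits).

```cpp
#include <cstdint>
#include <vector>

// Counts the subsets of {1, ..., n-1} whose elements sum to n.
// ways[s] = number of subsets of the parts seen so far that sum to s.
// Assumes n >= 1 and that the count fits in 64 bits (small n only).
std::uint64_t countSubsets(int n) {
    std::vector<std::uint64_t> ways(n + 1, 0);
    ways[0] = 1;                              // the empty subset
    for (int part = 1; part < n; ++part)      // part n itself is excluded
        for (int s = n; s >= part; --s)       // descending: each part used at most once
            ways[s] += ways[s - part];
    return ways[n];
}
```

For example, countSubsets(10) gives 9, which is q(10) - 1 with q(10) = 10 partitions of 10 into distinct parts.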

O(n^2) algorithm to find largest 3 integer arithmetic series

The problem is fairly simple. Given an input of N (3 <= N <= 3000) integers, find the largest sum of a 3-integer arithmetic series in the sequence. Eg. (15, 8, 1) is a larger arithmetic series than (12, 7, 2) because 15 + 8 + 1 > 12 + 7 + 2. The integers that are part of the largest arithmetic series do NOT have to be adjacent, and the order they appear in is irrelevant.
An example input would be:
6
1 6 11 2 7 12
where the first number is N (in this case, 6) and the second line is the sequence N integers long.
And the output would be the largest sum of any 3-integer arithmetic series. Like so:
21
because 2, 7 and 12 has the largest sum of any 3-integer arithmetic series in the sequence, and 2 + 7 + 12 = 21. It is also guaranteed that a 3-integer arithmetic series exists in the sequence.
EDIT: The numbers that make up the sum (output) have to be an arithmetic series (constant difference) that is 3 integers long. In the case of the sample input, (1 6 11) is a possible arithmetic series, but it is smaller than (2 7 12) because 2 + 7 + 12 > 1 + 6 + 11. Thus 21 would be outputted because it is larger.
Here is my attempt at solving this question in C++:
#include <bits/stdc++.h>
using namespace std;
vector<int> results;
vector<int> middle;
vector<int> diff;
int main(){
int n;
cin >> n;
int sizes[n];
for (int i = 0; i < n; i++){
int size;
cin >> size;
sizes[i] = size;
}
sort(sizes, sizes + n, greater<int>());
for (int i = 0; i < n; i++){
for (int j = i+1; j < n; j++){
int difference = sizes[i] - sizes[j];
diff.insert(diff.end(), difference);
middle.insert(middle.end(), sizes[j]);
}
}
for (size_t i = 0; i < middle.size(); i++){
int difference = middle[i] - diff[i];
for (int j = 0; j < n; j++){
if (sizes[j] == difference) results.insert(results.end(), middle[i]);
}
}
int max = 0;
for (size_t i = 0; i < results.size(); i++) {
if (results[i] > max) max = results[i];
}
int answer = max * 3;
cout << answer;
return 0;
}
My approach was to record the middle number and the difference of every pair in separate vectors, then loop through the vectors and check whether the middle number minus the difference is in the array; if so, the middle number is added to another vector. Then the largest middle number is found and multiplied by 3 to get the sum. This approach took my algorithm from O(n^3) to roughly O(n^2). However, the algorithm doesn't always produce the correct output (and I can't think of a test case where it fails), and since I'm using separate vectors, I get a std::bad_alloc error for large N values, probably because I am using too much memory. The time limit in this question is 1.4 sec per test case, and the memory limit is 64 MB.
Since N can only be max 3000, O(n^2) is sufficient. So what is an optimal O(n^2) solution (or better) to this problem?
A simple solution for this problem is to put all elements into a std::map to count their frequencies, then iterate over every pair of elements as two terms of the arithmetic progression, and search the map for the third.
Iterating over the pairs takes O(n^2), and each map lookup with find() takes O(log n), for O(n^2 log n) overall.
#include <algorithm>
#include <climits>
#include <iostream>
#include <map>
using namespace std;
const int maxn = 3000;
int a[maxn+1];
map<int, int> freq;
int main()
{
int n; cin >> n;
for (int i = 1; i <= n; i++) {cin >> a[i]; freq[a[i]]++;} // inserting frequencies
int maxi = INT_MIN;
for (int i = 1; i <= n-1; i++)
{
for (int j = i+1; j <= n; j++)
{
int first = a[i], sec = a[j]; if (first > sec) {swap(first, sec);} // ensure that first <= sec
int gap = sec - first; // common difference
if (gap == 0 && freq[first] >= 3) {maxi = max(maxi, first*3); } // three equal values form a progression with difference 0
else
{
int third1 = first - gap; // treat (first, sec) as the two largest terms; since all pairs are tried, every progression is found this way
if (freq.find(third1) != freq.end() && gap != 0) {maxi = max(maxi, first + sec + third1); } // look up the third element
}
}
}
cout << maxi;
}
Output : 21
Another test :
6
3 4 5 7 7 7
Output : 21
Another test :
5
10 10 9 8 7
Output : 27
You can try std::unordered_map to try and reduce the complexity even more.
Also see Why is "using namespace std;" considered bad practice?
The sum of a 3-element arithmetic progression is 3-times the middle element, so I would search around a middle element, and would start the search from the "upper" end of the "array" (and have it sorted). This way the first hit is the largest one. Also, the actual array would be a frequency-map, so elements are unique, but still track if any element has 3 copies, because that can become a hit (progression by 0).
I think it may be better to create the frequency-map first, and sort it later, simply because it may result in sorting fewer elements - though they are going to be pairs of value and count in this case.
function max3(arr){
let stats=new Map();
for(let value of arr)
stats.set(value,(stats.get(value) || 0)+1);
let array=Array.from(stats); // array of [value,count] arrays
array.sort((x,y)=>y[0]-x[0]); // sort by value, descending
for(let i=0;i<array.length;i++){
let [value,count]=array[i];
if(count>=3)
return 3*value;
for(let j=0;j<i;j++)
if(stats.has(2*value-array[j][0]))
return 3*value;
}
}
console.log(max3([1,6,11,2,7,12])); // original example
console.log(max3([3,4,5,7,7,7])); // an example of 3 identical elements
console.log(max3([10,10,9,8,7])); // an example from another answer
console.log(max3([1,2,11,6,7,12])); // example with non-adjacent elements
console.log(max3([3,7,1,1,1])); // check for finding lowest possible triplet too
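Since the question is about C++, the same middle-element idea could be written roughly as follows. This is a sketch under assumptions: maxTripleSum is a hypothetical name, and it returns LLONG_MIN when no progression exists (the problem statement guarantees one does).

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Largest sum of a 3-term arithmetic progression taken from values.
// The sum is 3 * middle, so candidates for the middle are tried in
// descending order and the first hit is the answer.
long long maxTripleSum(const std::vector<int>& values) {
    std::unordered_map<long long, int> count;
    for (int v : values) ++count[v];

    std::vector<long long> distinct;
    distinct.reserve(count.size());
    for (const auto& entry : count) distinct.push_back(entry.first);
    std::sort(distinct.begin(), distinct.end(), std::greater<long long>());

    for (std::size_t i = 0; i < distinct.size(); ++i) {
        const long long mid = distinct[i];
        if (count[mid] >= 3) return 3 * mid;            // progression with difference 0
        for (std::size_t j = 0; j < i; ++j) {           // distinct[j] > mid: the larger term
            const long long low = 2 * mid - distinct[j];
            if (count.find(low) != count.end()) return 3 * mid;
        }
    }
    return LLONG_MIN;                                    // no progression found
}
```

Using long long keys avoids overflow when computing 2 * mid - distinct[j].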

For a given number N, how do I find x, S.T product of (x and no. of factors to x) = N?

To find the factors of a number, I am using the function void primeFactors(int n):
# include <stdio.h>
# include <math.h>
# include <iostream>
# include <map>
using namespace std;
// A function to print all prime factors of a given number n
map<int,int> m;
void primeFactors(int n)
{
// Print the number of 2s that divide n
while (n%2 == 0)
{
printf("%d ", 2);
m[2] += 1;
n = n/2;
}
// n must be odd at this point. So we can skip one element (Note i = i +2)
for (int i = 3; i <= sqrt(n); i = i+2)
{
// While i divides n, print i and divide n
while (n%i == 0)
{
int k = i;
printf("%d ", i);
m[k] += 1;
n = n/i;
}
}
// This condition is to handle the case whien n is a prime number
// greater than 2
if (n > 2)
{
m[n] += 1;
printf ("%d ", n);
}
cout << endl;
}
/* Driver program to test above function */
int main()
{
int n = 72;
primeFactors(n);
map<int,int>::iterator it;
int to = 1;
for(it = m.begin(); it != m.end(); ++it){
cout << it->first << " appeared " << it->second << " times "<< endl;
to *= (it->second+1);
}
cout << to << " total facts" << endl;
return 0;
}
You can check it here. Test case n = 72.
http://ideone.com/kaabO0
How do I solve the above problem using the above algorithm? (Can it be optimized more?) I have to consider large numbers as well.
What I want to do:
Take N = 864 as an example: we find X = 72, because 72 * 12 (the number of factors of 72) = 864.
There are prime-factorization algorithms for big numbers, but they are rarely needed in programming contests.
I will explain three methods; you can implement whichever fits your constraints.
Once you have implemented one, I suggest solving this problem.
Note: In this answer, Q denotes the number of queries.
O(sqrt(N)) per query, O(Q * sqrt(N)) in total
Your algorithm's time complexity is O(n^0.5) per query.
However, you implemented it with int (32-bit); for larger inputs you can switch to long long.
Here's my implementation: http://ideone.com/gkGkkP
O(sqrt(maxn) * log(log(maxn)) + Q * sqrt(maxn) / log(maxn)) algorithm
You can reduce the number of loop iterations, because composite values of i never need to be tested.
So the loop only has to run over prime numbers.
Algorithm:
Calculate all prime numbers <= sqrt(maxn) with the sieve of Eratosthenes. The time complexity is O(sqrt(maxn) * log(log(maxn))).
For each query, loop over the primes i with i <= sqrt(n). By the prime number theorem there are about sqrt(n) / log(n) such primes, so the time complexity is O(sqrt(n) / log(n)) per query.
More efficient algorithms
More efficient algorithms exist, but they are not often used in programming contests.
If you search for "integer factorization algorithm" on the internet or Wikipedia, you will find algorithms such as Pollard's rho or the general number field sieve.
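To tie the factorization back to the original question (find x with x * d(x) = N, where d(x) is the number of divisors of x), one possible approach is to note that x must divide N and test each divisor. The sketch below uses plain trial division and assumes N is at most about 10^18; divisorCount and solveXTimesDivisors are hypothetical names. For large N, reusing the prime factorization of N to enumerate divisors would be faster.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Number of divisors of x, by trial division in O(sqrt(x)).
std::uint64_t divisorCount(std::uint64_t x) {
    std::uint64_t count = 1;
    for (std::uint64_t p = 2; p * p <= x; ++p) {
        std::uint64_t exponent = 0;
        while (x % p == 0) { x /= p; ++exponent; }
        count *= exponent + 1;
    }
    if (x > 1) count *= 2;      // one prime factor above sqrt(x) remains
    return count;
}

// All x with x * divisorCount(x) == n, in ascending order.
// x must divide n, and then the condition is divisorCount(x) == n / x,
// which avoids computing the product.
std::vector<std::uint64_t> solveXTimesDivisors(std::uint64_t n) {
    std::vector<std::uint64_t> answers;
    for (std::uint64_t d = 1; d * d <= n; ++d) {
        if (n % d != 0) continue;
        const std::uint64_t other = n / d;
        if (divisorCount(d) == other) answers.push_back(d);
        if (other != d && divisorCount(other) == d) answers.push_back(other);
    }
    std::sort(answers.begin(), answers.end());
    return answers;
}
```

For N = 864 this finds x = 72, matching the example in the question.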
Well, I will show you the code.
# include <stdio.h>
# include <iostream>
# include <map>
using namespace std;
const long MAX_NUM = 2000000;
long prime[MAX_NUM] = {0}, primeCount = 0;
bool isNotPrime[MAX_NUM] = {1, 1}; // yes. can be improve, but it is useless when sieveOfEratosthenes is end
void sieveOfEratosthenes() {
//#see https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
for (long i = 2; i < MAX_NUM; i++) { // it must be i++
if (!isNotPrime[i]) //if it is prime,put it into prime[]
prime[primeCount++] = i;
for (long j = 0; j < primeCount && i * prime[j] < MAX_NUM; j++) { /*foreach prime[]*/
// if(i * prime[j] >= MAX_NUM){ // if large than MAX_NUM break
// break;
// }
isNotPrime[i * prime[j]] = 1; // set i * prime[j] not a prime.as you see, i * prime[j]
if (!(i % prime[j])) //if this prime the min factor of i,than break.
// and it is the answer why not i+=( (i & 1) ? 2 : 1).
// hint : when we judge 2,prime[]={2},we set 2*2=4 not prime
// when we judge 3,prime[]={2,3},we set 3*2=6 3*3=9 not prime
// when we judge 4,prime[]={2,3},we set 4*2=8 not prime (why not set 4*3=12?)
// when we judge 5,prime[]={2,3,5},we set 5*2=10 5*3=15 5*5=25 not prime
// when we judge 6,prime[]={2,3,5},we set 6*2=12 not prime,than we can stop
// why not put 6*3=18 6*5=30 not prime? 18=9*2 30=15*2.
// this code can make each num be set only once,I hope it can help you to understand
// this is difficult to understand but very useful.
break;
}
}
}
void primeFactors(long n)
{
map<int,int> m;
map<int,int>::iterator it;
for (int i = 0; i < primeCount && prime[i] <= n; i++) // test every prime not larger than n (2, 3, 5, 7, ...); the step must be i++
{
while (n%prime[i] == 0)
{
cout<<prime[i]<<" ";
m[prime[i]] += 1;
n = n/prime[i];
}
}
cout<<endl;
int to = 1;
for(it = m.begin(); it != m.end(); ++it){
cout << it->first << " appeared " << it->second << " times "<< endl;
to *= (it->second+1);
}
cout << to << " total facts" << endl;
}
int main()
{
//first init for calculate all prime numbers,for example we define MAX_NUM = 2000000
// the result of prime[] should be stored, you primeFactors will use it
sieveOfEratosthenes();
//second loop for i (i*i <= n and i is a prime number). n<=MAX_NUM
int n = 72;
primeFactors(n);
n = 864;
primeFactors(n);
return 0;
}
My best shot at performance without going overboard with special algorithms.
The sieve of Eratosthenes: the complexity of the code below is O(N*log(log(N))). Starting the inner j loop from i*i instead of 2*i skips multiples that were already marked.
#include <vector>
using std::vector;
void erathostenes_sieve(size_t upToN, vector<size_t>& primes) {
primes.clear();
vector<bool> bitset(upToN+1, true); // if the bitset[i] is true, the i is prime
bitset[0]=bitset[1]=0;
// if i is 2, will jump to 3, otherwise will jump on odd numbers only
for(size_t i=2; i<=upToN; i+=( (i&1) ? 2 : 1)) {
if(bitset[i]) { // i is prime
primes.push_back(i);
// it is enough to start the next cycle from i*i, because all the
// other primality tests below it are already performed:
// e.g:
// - i*(i-1) was surely marked non-prime when we considered multiples of 2
// - i*(i-2) was tested at (i-2) if (i-2) was prime or earlier (if non-prime)
for(size_t j=i*i; j<=upToN; j+=i) {
bitset[j]=false; // all multiples of the prime with value of i
// are marked non-prime, using **addition only**
}
}
}
}
Now factoring based on the primes (stored in a sorted vector). Before this, let's examine the myth that sqrt is expensive while a large bunch of multiplications is not.
First of all, note that sqrt is not that expensive anymore: on older CPUs (x86/32b) it used to be twice as expensive as a division (and a modulo operation is a division), while on newer architectures the CPU costs are equal. Since factorisation is all about % operations again and again, one may still consider using sqrt now and then (e.g. if and when it saves CPU time).
For example, consider the following code for N=65537 (which is the 6543rd prime), assuming primes has 10000 entries:
size_t limit=std::sqrt(N);
size_t largestPrimeGoodForN=std::distance(
primes.begin(),
std::upper_bound(primes.begin(), primes.end(), limit) // binary search
);
// go descendingly from limit!!!
for(int i=largestPrimeGoodForN; i>=0; i--) {
// factorisation loop
}
We have:
1 sqrt (equal 1 modulo),
1 search in 10000 entries - at max 14 steps, each involving 1 comparison, 1 right-shift division-by-2 and 1 increment/decrement - so let's say a cost equal with 14-20 multiplications (if ever)
1 difference because of std::distance.
So, maximal cost - 1 div and 20 muls? I'm generous.
On the other side:
for(int i=0; primes[i]*primes[i]<N; i++) {
// factorisation code
}
Looks much simpler, but as N=65537 is prime, we'll go through the whole cycle up to i=54 (primes[54]=257 is the first prime whose square exceeds N, which breaks the cycle), a total of 55 multiplications.
Try this with a higher prime number and I guarantee you that 1 sqrt plus 1 binary search is a better use of CPU cycles than all the multiplications along the way in the simpler form of the cycle, which is often touted as the better-performing solution.
So, back to factorisation code:
#include <algorithm>
#include <cmath>
#include <unordered_map>
void factor(size_t N, std::unordered_map<size_t, size_t>& factorsWithMultiplicity) {
factorsWithMultiplicity.clear();
while( !(N & 1) ) { // while N is even, cheaper test than a '% 2'
factorsWithMultiplicity[2]++;
N = N >> 1; // div by 2 of an unsigned number, cheaper than the actual /2
}
// now that we know N is odd, we start using the primes from the sieve
size_t limit=std::sqrt(N); // sqrt is no longer *that* expensive,
vector<size_t> primes;
// fill the primes up to the limit. Let's be generous, add 1 to it
erathostenes_sieve(limit+1, primes);
// we know that the largest prime worth checking is
// the last element of the primes.
for(
size_t largestPrimeIndexGoodForN=primes.size()-1;
largestPrimeIndexGoodForN<primes.size(); // size_t is unsigned, so after zero will underflow
// we'll handle the cycle index inside
) {
bool wasFactor=false;
size_t factorToTest=primes[largestPrimeIndexGoodForN];
while( !( N % factorToTest) ) {
wasFactor=true;// found one
factorsWithMultiplicity[factorToTest]++;
N /= factorToTest;
}
if(1==N) { // done
break;
}
if(wasFactor) { // time to resynchronize the index
limit=std::sqrt(N);
largestPrimeIndexGoodForN=std::distance(
primes.begin(),
std::upper_bound(primes.begin(), primes.end(), limit)
);
}
else { // no luck this time
largestPrimeIndexGoodForN--;
}
} // done the factoring cycle
if(N>1) { // N was prime to begin with
factorsWithMultiplicity[N]++;
}
}

Intuitive method to find prime numbers in a range

While trying to find prime numbers in a range (see problem description), I came across the following code:
(Code taken from here)
// For each prime in sqrt(N) we need to use it in the segmented sieve process.
for (i = 0; i < cnt; i++) {
p = myPrimes[i]; // Store the prime.
s = M / p;
s = s * p; // The closest number less than M that is composite number for this prime p.
for (int j = s; j <= N; j = j + p) {
if (j < M) continue; // Because composite numbers less than M are of no concern.
/* j - M = index in the array primesNow, this is as max index allowed in the array
is not N, it is DIFF_SIZE so we are storing the numbers offset from.
while printing we will add M and print to get the actual number. */
primesNow[j - M] = false;
}
}
// In this loop the first prime numbers for example say 2, 3 are also set to false.
for (int i = 0; i < cnt; i++) { // Hence we need to print them in case they're in range.
if (myPrimes[i] >= M && myPrimes[i] <= N) // Without this loop you will see that for a
// range (1, 30), 2 & 3 doesn't get printed.
cout << myPrimes[i] << endl;
}
// primesNow[] = false for all composite numbers, primes found by checking with true.
for (int i = 0; i < N - M + 1; ++i) {
// i + M != 1 to ensure that for i = 0 and M = 1, 1 is not considered a prime number.
if (primesNow[i] == true && (i + M) != 1)
cout << i + M << endl; // Print our prime numbers in the range.
}
However, I didn't find this code intuitive and it was not easy to understand.
Can someone explain the general idea behind the above algorithm?
What alternative algorithms are there to mark non-prime numbers in a range?
That's overly complicated. Let's start with a basic Sieve of Eratosthenes, in pseudocode, that outputs all the primes less than or equal to n:
function primes(n)
    sieve := makeArray(2..n, True)
    for p from 2 to n
        if sieve[p]
            output(p)
            for i from p*p to n step p
                sieve[i] := False
This function calls output on each prime p; output can print the primes, or sum the primes, or count them, or do whatever you want to do with them. The outer for loop considers each candidate prime in turn; the sieving occurs in the inner for loop, where multiples of the current prime p are removed from the sieve.
Once you understand how that works, go here for a discussion of the segmented Sieve of Eratosthenes over a range.
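As a rough illustration of how the basic sieve extends to a range [lo, hi], here is a minimal segmented-sieve sketch. It is not the code from the question or from the linked discussion; primesInRange is a hypothetical name, and it assumes lo <= hi, that hi - lo is small enough for one array, and that hi is well below the 64-bit limit.

```cpp
#include <cstdint>
#include <vector>

// Primes in the closed range [lo, hi].
std::vector<std::uint64_t> primesInRange(std::uint64_t lo, std::uint64_t hi) {
    // Step 1: ordinary sieve for the base primes up to sqrt(hi).
    std::uint64_t root = 0;
    while ((root + 1) * (root + 1) <= hi) ++root;        // integer square root
    std::vector<bool> isBasePrime(root + 1, true);
    std::vector<std::uint64_t> basePrimes;
    for (std::uint64_t p = 2; p <= root; ++p) {
        if (!isBasePrime[p]) continue;
        basePrimes.push_back(p);
        for (std::uint64_t m = p * p; m <= root; m += p) isBasePrime[m] = false;
    }

    // Step 2: cross off multiples of each base prime inside the window.
    // Index v - lo stands for the number v.
    std::vector<bool> isPrime(hi - lo + 1, true);
    for (std::uint64_t p : basePrimes) {
        std::uint64_t start = (lo + p - 1) / p * p;      // first multiple of p that is >= lo
        if (start < p * p) start = p * p;                // never cross off p itself
        for (std::uint64_t m = start; m <= hi; m += p) isPrime[m - lo] = false;
    }

    std::vector<std::uint64_t> result;
    for (std::uint64_t v = lo; v <= hi; ++v)
        if (v >= 2 && isPrime[v - lo]) result.push_back(v);   // 0 and 1 are not prime
    return result;
}
```

Starting each prime at max(p*p, first multiple >= lo) avoids the problem noted in the question's code, where small primes inside the range get marked as composite and have to be printed separately.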
Have you considered a sieve at the bit level? It can hold a somewhat larger range of numbers, and with the buffer you could modify it to find, for example, the primes between 2 and 2^60 using 64-bit ints, by reusing the same buffer while preserving the offsets of the primes already discovered. The following uses an array of integers.
Declarations
#include <math.h> // sqrt(), the upper limit need to eliminate
#include <stdio.h> // for printing, could use <iostream>
Macros to manipulate bits; the following uses 32-bit ints
#define BIT_SET(d, n) (d[n>>5]|=1<<(n-((n>>5)<<5)))
#define BIT_GET(d, n) (d[n>>5]&1<<(n-((n>>5)<<5)))
#define BIT_FLIP(d, n) (d[n>>5]&=~(1<<(n-((n>>5)<<5))))
unsigned int n = 0x80000; // the upper limit 1/2 mb, with 32 bits each
// will get the 1st primes upto 16 mb
int *data = new int[n]; // allocate
unsigned int r = n * 0x20; // the actual number of bits avalible
You could use zeros to save time, but using one (1) for prime is a bit more intuitive
for(int i=0;i<n;i++)
data[i] = 0xFFFFFFFF;
unsigned int seed = 2; // the seed starts at 2
unsigned int uLimit = sqrt(r); // the upper limit for checking off the sieve
BIT_FLIP(data, 0); // zero is not prime
BIT_FLIP(data, 1); // one is not prime
Time to discover the primes; this took under half a second
// untill uLimit is reached
while(seed < uLimit) {
// don't include itself when eliminating canidates
for(int i=seed+seed;i<r;i+=seed)
BIT_FLIP(data, i);
// find the next bit still active (set to 1), don't include the current seed
for(int i=seed+1;i<r;i++) {
if (BIT_GET(data, i)) {
seed = i;
break;
}
}
}
Now for the output; this will consume the most time
unsigned long bit_index = 0; // the current bit
int w = 8; // the width of a column
unsigned pc = 0; // prime, count, to assist in creating columns
for(int i=0;i<n;i++) {
unsigned long long int b = 1; // double width, so there is no overflow
// if a bit is still set, include that as a result
while(b < 0xFFFFFFFF) {
if (data[i]&b) {
printf("%8lu ", bit_index);
if(((pc++) % w) == 0)
putchar('\n'); // add a new row
}
bit_index++;
b<<=1; // multiply by 2, to check the next bit
}
}
Clean up
delete [] data;

How to reduce complexity of this code

Can anyone please provide a better algorithm than trying all the combinations for this problem?
Given an array A of N numbers, find the number of distinct pairs (i,
j) such that j >=i and A[i] = A[j].
First line of the input contains number of test cases T. Each test
case has two lines, first line is the number N, followed by a line
consisting of N integers which are the elements of array A.
For each test case print the number of distinct pairs.
Constraints:
1 <= T <= 10
1 <= N <= 10^6
-10^6 <= A[i] <= 10^6 for 0 <= i < N
My idea is to sort the array first, find the frequency of every distinct integer, add nC2 for each frequency, and finally add the length of the array. Unfortunately, it gives a wrong answer for some test cases that I don't know. Please help. Here is the implementation.
code:
#include <iostream>
#include<cstdio>
#include<algorithm>
using namespace std;
long fun(long a) //to find the aC2 for given a
{
if (a == 1) return 0;
return (a * (a - 1)) / 2;
}
int main()
{
long t, i, j, n, tmp = 0;
long long count;
long ar[1000000];
cin >> t;
while (t--)
{
cin >> n;
for (i = 0; i < n; i++)
{
cin >> ar[i];
}
count = 0;
sort(ar, ar + n);
for (i = 0; i < n - 1; i++)
{
if (ar[i] == ar[i + 1])
{
tmp++;
}
else
{
count += fun(tmp + 1);
tmp = 0;
}
}
if (tmp != 0)
{
count += fun(tmp + 1);
}
cout << count + n << "\n";
}
return 0;
}
Keep a count of how many times each number appears in an array. Then iterate over the result array and add the triangular number for each.
For example (from the source test case):
Input:
3
1 2 1
count array = {0, 2, 1} // no zeroes, two ones, one two
pairs = triangle(0) + triangle(2) + triangle(1)
pairs = 0 + 3 + 1
pairs = 4
Triangle numbers can be computed by (n * n + n) / 2, and the whole thing is O(n).
Edit:
First, there's no need to sort if you're counting frequencies. I see what you did with sorting, but if you just keep a separate array of frequencies, it's easier. It takes more space, but since the array length is at most 10^6 and the elements are bounded by 10^6 in absolute value, a frequency array indexed by value is enough. Because elements can be negative, it needs 2 * 10^6 + 1 entries, offset by 10^6; that still fits easily in the 256MB limit given in the challenge.
For the n choose 2 part: because each element can be paired with itself, a value with frequency f contributes (f+1) choose 2 pairs, which is triangle(f) = (f choose 2) + f. Summed over all values, the extra f terms add up to n, so adding n at the end as you did gives the same total. A more likely culprit in your code is tmp: it is not reset to 0 after the final if (tmp != 0) block, so when an array ends in a run of equal values, the leftover count leaks into the next test case.
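The frequency-counting approach could be sketched like this, assuming every value lies in [-maxAbs, maxAbs] as in the stated constraints. countEqualPairs and maxAbs are hypothetical names introduced for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts pairs (i, j) with j >= i and a[i] == a[j] in O(n + maxAbs).
// Assumes every element of a lies in [-maxAbs, maxAbs].
std::uint64_t countEqualPairs(const std::vector<int>& a, int maxAbs = 1000000) {
    // Shift values by maxAbs so that negative numbers get a valid index.
    std::vector<std::uint32_t> freq(2 * static_cast<std::size_t>(maxAbs) + 1, 0);
    for (int v : a) ++freq[static_cast<std::size_t>(v + maxAbs)];

    std::uint64_t pairs = 0;
    for (std::uint64_t f : freq)
        pairs += f * (f + 1) / 2;   // triangle(f): includes pairing an element with itself
    return pairs;
}
```

Using a 64-bit accumulator matters here: with n = 10^6 equal elements the answer is about 5 * 10^11, which does not fit in 32 bits.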