Fastest way to find how many numbers fall in each of multiple intervals - C++

I have a task: I've been given n numbers and m intervals, and for each interval I need to figure out how many of the numbers lie in it. I've written code with a complexity of O(n*m), but I need to optimize it further. Any help?
Code:
#include <bits/stdc++.h>
using namespace std;
int main()
{
cin.tie(0);
ios_base::sync_with_stdio(0);
cout.tie(0);
int n,m,temp,temp1;
vector <pair<int, int>> uogienes;
vector <int> erskeciai;
cin >> n >> m;
for (int i = 0; i< n; i++){
cin>>temp;
erskeciai.push_back(temp);
}
temp = 0;
for (int i = 0; i<m; i++){
cin>> temp >> temp1;
uogienes.push_back(make_pair(temp, temp1));
}
for(int i = 0; i<m; i++){
temp=0;
for(int h = 0; h<n; h++){
if(uogienes[i].first <= erskeciai[h] && uogienes[i].second >= erskeciai[h]){
temp++;
}
}
cout << temp << "\n";
}
return 0;
}

As DAle already noted, you can first sort the n numbers. A good algorithm, like merge sort or heap sort (or simply std::sort), gives you a complexity of O(n*log(n)).
After that, run a binary search for both the 'first' and 'second' ends of each interval. std::lower_bound and std::upper_bound each take O(log(n)) on a sorted random-access range, so that is O(m*log(n)) for all intervals.
The difference between the two positions found (std::upper_bound for 'second' minus std::lower_bound for 'first') is the amount of numbers in that interval.
In total you'll have O((m+n)*log(n)).
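The sort-then-binary-search idea above could be sketched roughly like this. The function name count_in_intervals is made up for illustration, and the intervals are assumed to be closed, [first, second], as in the question's comparison:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: for each closed interval [first, second], counts how
// many of the values lie inside it. `values` is taken by value so the
// caller's data is not reordered.
std::vector<int> count_in_intervals(std::vector<int> values,
                                    const std::vector<std::pair<int, int>>& intervals) {
    std::sort(values.begin(), values.end());  // O(n log n)
    std::vector<int> counts;
    counts.reserve(intervals.size());
    for (const auto& iv : intervals) {
        // First element >= iv.first.
        auto lo = std::lower_bound(values.begin(), values.end(), iv.first);
        // First element > iv.second.
        auto hi = std::upper_bound(values.begin(), values.end(), iv.second);
        // An inverted interval (first > second) yields 0.
        counts.push_back(lo < hi ? static_cast<int>(hi - lo) : 0);
    }
    return counts;
}
```

Reading the input and printing one count per line stays the same as in the question; only the double loop is replaced by one call per interval.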

Related

Is there any O(n^2) algorithm to generate all sub-sequences of an array?

I was wondering if there is any O(n^2) complexity algorithm for generating all sub-sequences of an array. I know an algorithm but it takes O((2^n)*n) time.
int main() {
int n;
cin >> n;
vector<int> a(n);
for(int i = 0; i < n; ++i)
cin >> a[i];
int64_t opsize = pow(2,n);
for (int counter = 1; counter < opsize; counter++) {
for (int j = 0; j < n; j++) {
if (counter & (1 << j))
cout << a[j] << " ";
}
cout << endl;
}
}
No
There cannot be any algorithm with complexity below O(2^n), simply because there are 2^n - 1 non-empty sub-sequences. You need to print each of them, hence the running time has to be at least proportional to 2^n.
You can't improve the algorithm's complexity, but you can improve how the stream is used.
As the other answer points out, O(n * 2^n) is the best you can have.
When you use std::endl you flush the stream's buffer. For best performance, buffers should flush themselves only when they are full.
Since each subsequence has to be quite short (at most 64 elements), you are flushing the stream very often and take a serious performance hit.
So replacing std::endl with '\n' will provide a significant performance improvement.
Other tricks which can help improve stream performance:
int main() {
std::ios::sync_with_stdio(false);
std::cin.tie(nullptr);
int n;
cin >> n;

C++, multiset, Time complexity

#include<bits/stdc++.h>
int main()
{
int n;
std::cin>>n;
char st[n];
getchar();
for (int i=0; i<n; ++i)
st[i]=getchar();
std::multiset<char> s;
int pos1=0,pos2=n-1;
for (char c:st) s.insert(c);
for (int i=0; i<n; ++i) {
if (s.count(st[i])==1) {
pos1=i;
break;
} else s.erase(s.find(st[i]));
}
for (int i=n-1; i>=0; --i) {
if (s.count(st[i])==1) {
pos2=i;
break;
} else s.erase(s.find(st[i]));
}
std::cout<<pos2-pos1+1;
}
I have just submitted this code to Codeforces and it fails the time limit (2 s). I don't know why, because the constraint on n is 10^5 and my code runs in O(n log n). Can you help me? Thanks. Link to the problem: http://codeforces.com/problemset/problem/701/C
Your algorithm performs O(n) multiset operations, but that is no guarantee of staying within the time limit. The constant factor hidden by big-O notation may be too large, and std::multiset::count is logarithmic in the container size plus linear in the number of matching elements, so with many repeated characters each call costs far more than O(log n).
You are using a multiset only to keep a count for each type of Pokemon. You lose a lot of time erasing from that multiset and counting in it again. You can do much better than the multiset:
By using a map to keep the count for each type and to update it
Better yet, since Pokemon types are encoded as single chars, you can use an array of 256 elements to keep track of the counts. This way you avoid the log(n) factor of multiset (and map). Here's a refactored version of your code that should run much faster; it runs in O(n).
#include <array>
#include <cstdio>
#include <iostream>
#include <vector>
int main()
{
int n;
std::cin >> n;
std::vector<char> st(n);
std::array<int, 256> count{}; // zero-initialized; we could also use a map if we had a bigger set of types...
// the way you read the input can also be speeded-up,
// but I want to focus on speeding up the algorithm
getchar();
for (int i=0; i<n; ++i) {
// cast to unsigned char so the array index is never negative
st[i]=getchar(); ++count[(unsigned char)st[i]];
}
int pos1=0,pos2=n-1;
for (int i=0; i < n; ++i) {
if (count[(unsigned char)st[i]] == 1) {
pos1 = i;
break;
} else --count[(unsigned char)st[i]];
}
for (int i=n-1; i>=0; --i) {
if (count[(unsigned char)st[i]] == 1) {
pos2=i;
break;
} else --count[(unsigned char)st[i]];
}
std::cout<<pos2-pos1+1;
}

Count number of ways for choosing two numbers in efficient algorithm

I solved this problem, but I got TLE (Time Limit Exceeded) on the online judge.
The output of my program is right, but I think the approach can be made more efficient.
The problem:
Given n integer numbers, count the number of ways in which we can choose two elements such
that their absolute difference is less than 32.
In a more formal way, count the number of pairs (i, j) (1 ≤ i < j ≤ n) such that
|V[i] - V[j]| < 32, where |X| is the absolute value of X.
Input
The first line of input contains one integer T, the number of test cases (1 ≤ T ≤ 128).
Each test case begins with an integer n (1 ≤ n ≤ 10,000).
The next line contains n integers (1 ≤ V[i] ≤ 10,000).
Output
For each test case, print the number of pairs on a single line.
my code in c++ :
int main() {
int T,n,i,j,k,count;
int a[10000];
cin>>T;
for(k=0;k<T;k++)
{ count=0;
cin>>n;
for(i=0;i<n;i++)
{
cin>>a[i];
}
for(i=0;i<n;i++)
{
for(j=i;j<n;j++)
{
if(i!=j)
{
if(abs(a[i]-a[j])<32)
count++;
}
}
}
cout<<count<<endl;
}
return 0;
}
How can I solve it with a more efficient algorithm?
Despite my previous (silly) answer, there is no need to sort the data at all. Instead you should count the frequencies of the numbers.
Then all you need to do is keep track of the number of viable numbers to pair with, while iterating over the possible values. Sorry no c++ but java should be readable as well:
int solve (int[] numbers) {
int[] frequencies = new int[10001];
for (int i : numbers) frequencies[i]++;
int solution = 0;
int inRange = 0;
for (int i = 0; i < frequencies.length; i++) {
if (i > 32) inRange -= frequencies[i - 32];
solution += frequencies[i] * inRange;
solution += frequencies[i] * (frequencies[i] - 1) / 2;
inRange += frequencies[i];
}
return solution;
}
#include <bits/stdc++.h>
using namespace std;
int a[10010];
int N;
int search (int x){
int low = 0;
int high = N;
while (low < high)
{
int mid = (low+high)/2;
if (a[mid] >= x) high = mid;
else low = mid+1;
}
return low;
}
int main() {
cin >> N;
for (int i=0 ; i<N ; i++) cin >> a[i];
sort(a,a+N);
long long ans = 0;
for (int i=0 ; i<N ; i++)
{
int t = search(a[i]+32);
ans += (t -i - 1);
}
cout << ans << endl;
return 0;
}
You can sort the numbers, and then use a sliding window. Starting with the smallest number, populate a std::deque with the numbers so long as they are no larger than the smallest number + 31. Then in an outer loop for each number, update the sliding window and add the new size of the sliding window to the counter. Update of the sliding window can be performed in an inner loop, by first pop_front every number that is smaller than the current number of the outer loop, then push_back every number that is not larger than the current number of the outer loop + 31.
One faster solution would be to first sort the array, then iterate through the sorted array and for each element only visit the elements to the right of it until the difference exceeds 31.
Sorting can probably be done via count sort (since you have 1 ≤ V[i] ≤ 10,000). So you get linear time for the sorting part. It might not be necessary though (maybe quicksort suffices in order to get all the points).
Also, you can do a trick for the inner loop (the "going to the right of the current element" part). Keep in mind that if S[i+k]-S[i]<32, then S[i+k]-S[i+1]<32, where S is the sorted version of V. With this trick the whole algorithm turns linear.
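The sort-plus-monotone-window trick described here might look roughly like the sketch below. It is a two-pointer variant (the left end only ever moves forward), not code from any of the answers. The name count_close_pairs and the limit parameter are assumptions for illustration, and limit is assumed to be positive.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts pairs (i, j), i < j, with |v[i] - v[j]| < limit.
// Assumes limit > 0. Sorting costs O(n log n); the scan afterwards is linear
// because `left` never moves backwards.
long long count_close_pairs(std::vector<int> v, int limit = 32) {
    std::sort(v.begin(), v.end());
    long long pairs = 0;
    std::size_t left = 0;
    for (std::size_t right = 0; right < v.size(); ++right) {
        // Drop elements that are too far below v[right].
        while (v[right] - v[left] >= limit) ++left;
        // Every element in [left, right) pairs with v[right].
        pairs += static_cast<long long>(right - left);
    }
    return pairs;
}
```

With std::sort the total is O(n log n). Swapping in a counting sort, as suggested above, would make the whole thing linear in n plus the value range.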
This can be done in a constant number of passes over the data, and the running time does not depend on the value of the "interval" (in your case, 32).
This is done by populating an array where a[i] = a[i-1] + number_of_times_i_appears_in_the_data - informally, a[i] holds the total number of elements that are less than or equal to i.
Code (for a single test case):
constexpr int UPPER_LIMIT = 10001; // must be a compile-time constant to size the array
constexpr int K = 32;
int frequencies[UPPER_LIMIT] = {0}; // O(U)
int n;
std::cin >> n;
for (int i = 0; i < n; i++) { // O(n)
int x;
std::cin >> x;
frequencies[x] += 1;
}
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
frequencies[i] += frequencies[i-1];
}
int count = 0;
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
int low_idx = std::max(i-K, 0);
int number_of_elements_with_value_i = frequencies[i] - frequencies[i-1];
if (number_of_elements_with_value_i == 0) continue;
int number_of_elements_with_value_K_close_to_i =
(frequencies[i-1] - frequencies[low_idx]);
std::cout << "i: " << i << " number_of_elements_with_value_i: " << number_of_elements_with_value_i << " number_of_elements_with_value_K_close_to_i: " << number_of_elements_with_value_K_close_to_i << std::endl;
count += number_of_elements_with_value_i * number_of_elements_with_value_K_close_to_i;
// Finally, add "duplicates" of i, this is basically sum of arithmetic
// progression with d=1, a0=0, n=number_of_elements_with_value_i
count += number_of_elements_with_value_i * (number_of_elements_with_value_i-1) /2;
}
std::cout << count;
Working full example on IDEone.
You can sort and then use break to end the inner loop as soon as the difference goes out of range.
int main()
{
int t;
cin>>t;
while(t--){
int n,c=0;
cin>>n;
int ar[n];
for(int i=0;i<n;i++)
cin>>ar[i];
sort(ar,ar+n);
for(int i=0;i<n;i++){
for(int j=i+1;j<n;j++){
if(ar[j]-ar[i] < 32)
c++;
else
break;
}
}
cout<<c<<endl;
}
}
Or you can use a frequency array over the value range, mark the occurrences of each element, and then for each element x check which values within 31 of x are present.
A good approach here is to split the numbers into separate buckets:
constexpr int limit = 10000;
constexpr int diff = 32;
constexpr int bucket_num = (limit/diff)+1;
std::array<std::vector<int>,bucket_num> buckets;
cin>>n;
int number;
for(i=0;i<n;i++)
{
cin >> number;
buckets[number/diff].push_back(number%diff);
}
Obviously the numbers that are in the same bucket are close enough to each other to fit the requirement, so we can just count all the pairs:
int result = std::accumulate(buckets.begin(), buckets.end(), 0,
[](int s, vector<int>& v){ return s + (v.size()*(v.size()-1))/2; });
The numbers that are in non-adjacent buckets cannot form any acceptable pairs, so we can just ignore them.
This leaves the last corner case - adjacent buckets - which can be solved in many ways:
for(int i=0;i<bucket_num-1;i++)
if(buckets[i].size() && buckets[i+1].size())
result += adjacent_buckets(buckets[i], buckets[i+1]);
Personally I like the "occurrence frequency" approach on the one bucket scale, but there may be better options:
int adjacent_buckets(const vector<int>& bucket1, const vector<int>& bucket2)
{
std::array<int,diff> pairs{};
for(int number : bucket1)
{
for(int i=0;i<number;i++)
pairs[i]++;
}
return std::accumulate(bucket2.begin(), bucket2.end(), 0,
[&pairs](int s, int n){ return s + pairs[n]; });
}
This function first builds an array of "numbers from lower bucket that are close enough to i", and then sums the values from that array corresponding to the upper bucket numbers.
In general this approach has O(N) complexity, in the best case it will require pretty much only one pass, and overall should be fast enough.
Working Ideone example
This solution can be considered O(N) to process the N input numbers, plus constant time to scan the frequency array:
#include <iostream>
using namespace std;
void solve()
{
int a[10001] = {0}, N, n, X32 = 0, ret = 0;
cin >> N;
for (int i=0; i<N; ++i)
{
cin >> n;
a[n]++;
}
for (int i=0; i<10001; ++i)
{
if (i >= 32)
X32 -= a[i-32];
if (a[i])
{
ret += a[i] * X32;
ret += a[i] * (a[i]-1)/2;
X32 += a[i];
}
}
cout << ret << endl;
}
int main()
{
int T;
cin >> T;
for (int i=0 ; i<T ; i++)
solve();
}
run this code on ideone
Solution explanation: a[i] represents how many times i was in the input series.
Then you go over the entire array, and X32 keeps track of the number of elements that are within range of i (values i-31 to i-1). The only tricky part really is to count correctly when some i is repeated multiple times: a[i] * (a[i]-1)/2. That's it.
You should start by sorting the input.
Then if your inner loop detects the distance grows above 32, you can break from it.
Thanks to everyone for the effort and time spent on this problem.
I appreciate all the attempts to solve it.
After testing the answers on the online judge, I found that the correct and most efficient solutions are Stef's answer and AbdullahAhmedAbdelmonem's answer. pavel's solution is also right, but it is exactly the same as Stef's solution in a different language (C++).
Stef's code ran in 358 ms on the Codeforces online judge and was accepted.
AbdullahAhmedAbdelmonem's code ran in 421 ms on the Codeforces online judge and was also accepted.
If they add a detailed explanation of their algorithm, the bounty will go to one of them.
You can try your solution and submit it to the Codeforces online judge at this link after choosing problem E. Time Limit Exceeded?
I also found a good and more understandable solution using a frequency array; its complexity is O(n) for a fixed difference bound.
In this algorithm, for each element inserted into the array you only need to consider a specific range:
begin = element - 31
end = element + 31
Then, for each inserted element, first add the current frequency count at its value to the answer, and then increment every entry of the frequency array in this range:
int main() {
int T,n,i,j,k,b,e,count;
int v[10000];
int freq[10001];
cin>>T;
for(k=0;k<T;k++)
{
count=0;
cin>>n;
for(i=1;i<=10000;i++)
{
freq[i]=0;
}
for(i=0;i<n;i++)
{
cin>>v[i];
}
for(i=0;i<n;i++)
{
count=count+freq[v[i]];
b=v[i]-31;
e=v[i]+31;
if(b<=0)
b=1;
if(e>10000)
e=10000;
for(j=b;j<=e;j++)
{
freq[j]++;
}
}
cout<<count<<endl;
}
return 0;
}
Finally, I think the best approach to this kind of problem is to use a frequency array and count the number of pairs in a specific range, because its time complexity is O(n).

SPOJ SUMFOUR.....TLE on test case 9

I am trying to solve the SPOJ problem SUMFOUR. I am getting TLE on test case 9: http://www.spoj.com/problems/SUMFOUR/
So, which part of my code has to be changed, and how? Here N<=4000.
#include <iostream>
#include<string>
#include<algorithm>
#include<cstdio>
#include<cmath>
#include<map>
#include<vector>
using namespace std;
int main()
{
int a[4005][5],n;
vector<int> b,c; // pairwise sums of columns 1+2 and 3+4
cin>>n;
for(int i=1;i<=n;i++)
for(int j=1;j<=4;j++)
scanf("%d",&a[i][j]);
int k=0;
for(int i=1;i<=n;i++)
{ int p=a[i][1];
for(int j=1;j<=n;j++)
{ b.push_back(p+a[j][2]);
k++;
}
}
k=0;
for(int i=1;i<=n;i++)
{ int p=a[i][3];
for(int j=1;j<=n;j++)
{ c.push_back(p+a[j][4]);
k++;
}
}
sort(b.begin(),b.end());
int cnt=0;
for(int j=0;j<k;j++)
if(find(b.begin(),b.end(),-c[j])!=b.end() )
cnt=cnt+count(b.begin(),b.end(),-c[j]) ;
printf("%d\n",cnt);
return 0;
}
The problem is here:
for(int j=0;j<k;j++)
if(find(b.begin(),b.end(),-c[j])!=b.end() )
cnt=cnt+count(b.begin(),b.end(),-c[j]) ;
For n = 4000 there are 4000^2 elements in each of b and c. The loop therefore takes on the order of 4000^4 operations, because find and count are each linear in the size of b, which of course causes the time limit to be exceeded.
So, how can you reduce the time? Since you already sort b, you can use binary search to speed up the counting, which reduces the time complexity of the loop above to O(n^2 log n).
Or you can use a map to count and store the frequency of each element in b and c.
map<long long, int> b;
map<long long, int> c;
for(int i=1;i<=n;i++)
{ long long p=a[i][1];
for(int j=1;j<=n;j++)
{
long long tmp =p + a[j][2];
b[tmp] = b[tmp] + 1;
}
}
// Similar for c
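map <long long, int>::iterator it;
long long result = 0; // must start at zero before accumulating
for (it = c.begin(); it != c.end(); ++it)
result += (long long)it->second * b[-(it->first)]; // widen before multiplying to avoid int overflow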
For your new update, please change this:
for(int j=1;j<=n;j++)
{ if( b.count(a[i][1]+a[j][2]) )
{ b[a[i][1]+a[j][2]]+=1;
c[a[i][3]+a[j][4]]+=1;
}
else
{ b[a[i][1]+a[j][2]]=1;
c[a[i][3]+a[j][4]]=1;
}
}
into this:
for(int j=1;j<=n;j++)
{
b[a[i][1]+a[j][2]]+=1;
c[a[i][3]+a[j][4]]+=1;
}
The condition check if( b.count(a[i][1]+a[j][2]) ) is for b only, but you also use it for c, which makes c incorrect.
Update: After trying to get accepted on SPOJ, it turned out that map is not fast enough, so I switched to binary search and got accepted.
My accepted code
Please don't use map: each operation costs O(log(n)) with a large constant factor.
Instead, you can just use two sorted arrays and, for every element in the
first array, binary search for its negation in the second array of pairwise sums.
Just change the find call in the last lines to a binary search such as std::lower_bound(c.begin(),c.end(),key), and count the repetitions starting from the index it returns.
That total sum gives the answer, and the expected complexity is
O(n^2log(n)).
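A minimal sketch of the binary-search approach both answers describe, assuming the four columns arrive as rows of four ints and that every pairwise sum fits in an int. The function name count_zero_quadruples is made up here, and std::equal_range is used in place of a hand-written search to get the whole run of equal values at once:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts quadruples (i, j, k, l) with
// rows[i][0] + rows[j][1] + rows[k][2] + rows[l][3] == 0.
// Assumes each pairwise sum fits in an int.
long long count_zero_quadruples(const std::vector<std::array<int, 4>>& rows) {
    const std::size_t n = rows.size();
    std::vector<int> ab, cd;  // all sums of columns 0+1 and of columns 2+3
    ab.reserve(n * n);
    cd.reserve(n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            ab.push_back(rows[i][0] + rows[j][1]);
            cd.push_back(rows[i][2] + rows[j][3]);
        }
    }
    std::sort(ab.begin(), ab.end());  // O(n^2 log n)
    long long total = 0;
    for (int s : cd) {
        // The run of entries equal to -s; its length is the number of matches.
        auto range = std::equal_range(ab.begin(), ab.end(), -s);
        total += range.second - range.first;
    }
    return total;
}
```

Only one of the two arrays needs sorting here, because the other is just iterated over.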

C++ algorithm optimization: find K combination from N elements

I am pretty new to C++ and am trying to do some HackerRank challenges as a way to work on that.
Right now I am trying to solve Angry Children problem: https://www.hackerrank.com/challenges/angry-children
Basically, it asks to create a program that given a set of N integer, finds the smallest possible "unfairness" for a K-length subset of that set. Unfairness is defined as the difference between the max and min of a K-length subset.
The way I'm going about it now is to find all K-length subsets and calculate their unfairness, keeping track of the smallest unfairness.
I wrote the following C++ program that seems to solve the problem correctly:
#include <cmath>
#include <cstdio>
#include <iostream>
using namespace std;
int unfairness = -1;
int N, K, minc, maxc, ufair;
int *candies, *subset;
void check() {
ufair = 0;
minc = subset[0];
maxc = subset[0];
for (int i = 0; i < K; i++) {
minc = min(minc,subset[i]);
maxc = max(maxc, subset[i]);
}
ufair = maxc - minc;
if (ufair < unfairness || unfairness == -1) {
unfairness = ufair;
}
}
void process(int subsetSize, int nextIndex) {
if (subsetSize == K) {
check();
} else {
for (int j = nextIndex; j < N; j++) {
subset[subsetSize] = candies[j];
process(subsetSize + 1, j + 1);
}
}
}
int main() {
cin >> N >> K;
candies = new int[N];
subset = new int[K];
for (int i = 0; i < N; i++)
cin >> candies[i];
process(0, 0);
cout << unfairness << endl;
return 0;
}
The problem is that HackerRank requires the program to come up with a solution within 3 seconds and that my program takes longer than that to find the solution for 12/16 of the test cases. For example, one of the test cases has N = 50 and K = 8; the program takes 8 seconds to find the solution on my machine. What can I do to optimize my algorithm? I am not very experienced with C++.
All you have to do is sort all the numbers in ascending order and then take the minimum of a[i + K - 1] - a[i] over all i from 0 to N - K inclusive.
That works because in an optimal subset all the numbers are adjacent in the sorted array.
One suggestion I'd give is to sort the integer list before selecting subsets. This will dramatically reduce the number of subsets you need to examine. In fact, you don't even need to create subsets: simply look at the elements at index i (starting at 0) and i+k-1, and the lowest difference over all such pairs [in valid bounds] is your answer. So now instead of n choose k subsets (which grows combinatorially) you just have to look at ~n windows (linear runtime), and sorting (n log n) becomes your bottleneck in performance.
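Both answers boil down to the same few lines. Here is a rough sketch, assuming 1 <= k <= number of elements; the function name min_unfairness is made up for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest (max - min) over all k-element subsets.
// Assumes 1 <= k <= candies.size().
int min_unfairness(std::vector<int> candies, std::size_t k) {
    std::sort(candies.begin(), candies.end());
    // After sorting, an optimal subset is a window of k adjacent elements,
    // so only the two ends of each window matter.
    int best = candies[k - 1] - candies[0];
    for (std::size_t i = 1; i + k <= candies.size(); ++i) {
        best = std::min(best, candies[i + k - 1] - candies[i]);
    }
    return best;
}
```

In the question's program this would replace process() and check() entirely: read N, K and the candies, then print min_unfairness(candies, K).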