Same Occurrence of two numbers in sub arrays (contiguous) - c++

Given an array of integers, find the total number of contiguous subarrays that contain the same number of x and y.
For example array [1,2,1] for x=1 and y=2
ans = 2 for its two sub arrays [1,2] and [2,1].
Checking every contiguous sub-sequence is O(n^2) which is too inefficient. Any idea for improvements?
This is the code I have written:
int get_total(int* a,int x,int y,int n){
int result=0;
for(int i=0;i<n;i++){
int x_c=0,y_c=0;
for(int j=i;j<n;j++){
if(a[j]==x){
x_c++;
}
if(a[j]==y){
y_c++;
}
if(x_c==y_c){
result++;
}
}
}
return result;
}
int main(){
int n,q;
cin >>n >>q;
int a[n];
for(int i=0;i<n;i++){
cin >>a[i];
}
while(q--){
int x,y;
cin >>x >>y;
cout <<get_total(a,x,y,n)<<"\n";
}
}
It runs in O(n^2) for every query.
The maximum array size is 8*10^3 and the maximum number of queries is 10^5.

Create an auxiliary array x_y_diffs, which is essentially:
#(times_x_appeared_thus_far) - #(times_y_appeared_thus_far)
And can be calculated as:
x_y_diffs[0] = 0
x_y_diffs[i] = x_y_diffs[i-1] + 1    if array[i-1] == x
               x_y_diffs[i-1] - 1    if array[i-1] == y
               x_y_diffs[i-1]        otherwise
It is easy to see it can be calculated in linear time.
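Now, observe that a "good" subarray spanning array[i..j] is exactly one where x_y_diffs[i] == x_y_diffs[j+1]: the difference accumulated before the subarray starts equals the difference after it ends.
So, you can simply iterate over x_y_diffs and maintain a histogram counting how many times each value has occurred.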
std::map<int, int> histogram;
int count = 0;
for (int x : x_y_diffs) {
count += histogram[x];
histogram[x] = histogram[x] + 1;
}
This takes O(n log n) time to calculate (each map insert/lookup is O(log n)), and can be improved to O(n) on average by switching from std::map to std::unordered_map.
So, the algorithm takes O(n) or O(n log n) time in total (depending on the map chosen) and O(n) additional space.
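The two steps above (building the running difference and counting equal values) can be folded into a single pass. The following is only a minimal sketch: the function name count_balanced is made up for illustration, and it assumes x != y.

```cpp
#include <unordered_map>
#include <vector>

// Counts contiguous subarrays of `a` that contain equally many x's and y's.
// Assumption: x != y. The name count_balanced is illustrative only.
long long count_balanced(const std::vector<int>& a, int x, int y) {
    std::unordered_map<int, long long> seen;  // running difference -> prefixes seen so far
    int diff = 0;                             // (#x - #y) over the prefix read so far
    seen[diff] = 1;                           // the empty prefix, i.e. x_y_diffs[0]
    long long count = 0;
    for (int v : a) {
        if (v == x) ++diff;
        else if (v == y) --diff;
        // Every earlier prefix with the same difference starts a good subarray
        // that ends at the current element.
        count += seen[diff];
        ++seen[diff];
    }
    return count;
}
```

For the array [1,2,1] with x=1 and y=2 this returns 2, matching the example in the question. Note that a subarray containing neither x nor y also counts, exactly as it does in the brute-force code.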
Demo on ideone

Related

Time complexity for finding minimum and maximum value of vector's elements in C++

It's the code covering the above problem:
#include <iostream>
using namespace std;
#include <vector>
int main()
{
unsigned int n;
cin >> n;
int elementsOfVector;
vector <double> wektor;
for(int i = 0; i<n; i++) {
cin >> elementsOfVector;
wektor.push_back(elementsOfVector);
}
double min = wektor[0];
double max = wektor[1];
if (min > max) {
min = wektor[1];
max = wektor[0];
}
for(int i = 2; i<n; i++) {
if (max < wektor[i]) {
max = wektor[i];
}
else if (min > wektor[i]) {
min = wektor[i];
}
}
cout << "Min " << min << " max " << max;
return 0;
}
According to my analysis:
First, there is a for loop that assigns values to all of the vector's elements. It makes n iterations, so its time complexity is O(n). Then there is an if statement whose condition compares one value to another; these are always just two values no matter what n is, so I assume it is O(1), constant complexity in Big-O notation. I am not sure this is correct, so I would be grateful if anyone could confirm. The second for loop makes n-2 iterations, and the operations inside it are simple comparisons and assignments that cost 1 each, so we can ignore them in Big-O notation. To sum up: n + n = 2n, which is O(2n), so the total time complexity is O(n). Am I right?
First part for loop. O(n)
Second part if(min>max) O(1)
Third part for loop O(n)
Total O(n)+O(1)+O(n) = O(n)
The third part is O(n) since you iterate n-2 times.
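For comparison, the whole search can be written as one linear pass with the standard library. This is a hedged sketch rather than a fix of the question's code: the helper name min_max is hypothetical, and it assumes the vector is not empty. Unlike the code in the question, it never reads element 1, so it also works for n == 1.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns {min, max} of a vector in a single O(n) pass.
// Assumption: `values` is not empty. The name min_max is illustrative only.
std::pair<double, double> min_max(const std::vector<double>& values) {
    // std::minmax_element returns a pair of iterators: first -> smallest, second -> largest.
    auto its = std::minmax_element(values.begin(), values.end());
    return {*its.first, *its.second};
}
```

The complexity is still O(n); the number of comparisons changes only by a constant factor, which Big-O notation ignores.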

Find indices i<j in an array of size n so that the sum of values at those indices is equal to i + j

My solution :
#include <bits/stdc++.h>
int main() {
int n;//Size of array
std::cin>>n;
std::vector<long long>vec_int;
int temp = n;
while(n--){
long long k ;
std::cin>>k;
vec_int.push_back(k);
}
n = temp;
int num = 0;
for(int i = 0 ; i < n-1 ; i++){
for(int j = i+1; j<n; j++){
if(i<j && i+j == vec_int[i]+vec_int[j])
num++;
}
}
std::cout<<num;
return 0;
}
I am scanning all pairs, which takes O(n^2) time. On very large arrays the run time exceeds the question's 2 s time limit. I tried sorting the array but didn't get far. How can I speed this up? Is it possible to do this in O(n) time?
Consider redefinition of your problem. The expression:
i+j == vec_int[i]+vec_int[j]
is algebraically equivalent to:
vec_int[i] - i == -(vec_int[j] - j)
So define:
a[i] = vec_int[i] - i
And now the question is to count how many times a[i] == -a[j].
This can be tested in O(n) on average. Use an unordered_map m to count how many times each negative value is present in a. Then each positive value a[i] can be paired with m[-a[i]] negative values. Also count the number of zeroes in a and compute the number of pairs between those: z zeroes give z*(z-1)/2 pairs.
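A single-pass variant of this counting idea is sketched below. It is one possible way to write it, not necessarily the answerer's exact method: for each index j it looks up how many earlier indices hold the negated value, which also covers the zero case because -0 == 0. The function name count_index_sum_pairs is made up, indices are 0-based as in the question's code, and the values are assumed to be far from the limits of long long.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Counts pairs i < j (0-based) with i + j == v[i] + v[j].
// Illustrative name; assumes v[j] - j and its negation do not overflow.
long long count_index_sum_pairs(const std::vector<long long>& v) {
    std::unordered_map<long long, long long> seen;  // a[i] -> number of earlier indices with that value
    long long pairs = 0;
    for (std::size_t j = 0; j < v.size(); ++j) {
        const long long a = v[j] - static_cast<long long>(j);  // a[j] = vec_int[j] - j
        auto it = seen.find(-a);                               // earlier i with a[i] == -a[j]
        if (it != seen.end()) pairs += it->second;
        ++seen[a];
    }
    return pairs;
}
```

For example, the input 0 1 2 gives a = 0 0 0, so all three pairs are counted.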

Count number of ways for choosing two numbers in efficient algorithm

I solved this problem, but I got TLE (Time Limit Exceeded) on the online judge.
The output of the program is right, but I think the approach can be made more efficient.
The problem:
Given n integer numbers, count the number of ways in which we can choose two elements such
that their absolute difference is less than 32.
In a more formal way, count the number of pairs (i, j) (1 ≤ i < j ≤ n) such that
|V[i] - V[j]| < 32, where |X| is the absolute value of X.
Input
The first line of input contains one integer T, the number of test cases (1 ≤ T ≤ 128).
Each test case begins with an integer n (1 ≤ n ≤ 10,000).
The next line contains n integers (1 ≤ V[i] ≤ 10,000).
Output
For each test case, print the number of pairs on a single line.
My code in C++:
int main() {
int T,n,i,j,k,count;
int a[10000];
cin>>T;
for(k=0;k<T;k++)
{ count=0;
cin>>n;
for(i=0;i<n;i++)
{
cin>>a[i];
}
for(i=0;i<n;i++)
{
for(j=i;j<n;j++)
{
if(i!=j)
{
if(abs(a[i]-a[j])<32)
count++;
}
}
}
cout<<count<<endl;
}
return 0;
}
I need help: how can I solve it with a more efficient algorithm?
Despite my previous (silly) answer, there is no need to sort the data at all. Instead you should count the frequencies of the numbers.
Then all you need to do is keep track of how many numbers are available to pair with while iterating over the possible values. Sorry, no C++, but the Java should be readable as well:
int solve (int[] numbers) {
int[] frequencies = new int[10001];
for (int i : numbers) frequencies[i]++;
int solution = 0;
int inRange = 0;
for (int i = 0; i < frequencies.length; i++) {
if (i > 32) inRange -= frequencies[i - 32];
solution += frequencies[i] * inRange;
solution += frequencies[i] * (frequencies[i] - 1) / 2;
inRange += frequencies[i];
}
return solution;
}
#include <bits/stdc++.h>
using namespace std;
int a[10010];
int N;
int search (int x){
int low = 0;
int high = N;
while (low < high)
{
int mid = (low+high)/2;
if (a[mid] >= x) high = mid;
else low = mid+1;
}
return low;
}
int main() {
cin >> N;
for (int i=0 ; i<N ; i++) cin >> a[i];
sort(a,a+N);
long long ans = 0;
for (int i=0 ; i<N ; i++)
{
int t = search(a[i]+32);
ans += (t -i - 1);
}
cout << ans << endl;
return 0;
}
You can sort the numbers, and then use a sliding window. Starting with the smallest number, populate a std::deque with the numbers so long as they are no larger than the smallest number + 31. Then in an outer loop for each number, update the sliding window and add the new size of the sliding window to the counter. Update of the sliding window can be performed in an inner loop, by first pop_front every number that is smaller than the current number of the outer loop, then push_back every number that is not larger than the current number of the outer loop + 31.
One faster solution would be to first sort the array, then iterate through the sorted array and for each element only visit the elements to the right of it until the difference exceeds 31.
Sorting can probably be done via count sort (since you have 1 ≤ V[i] ≤ 10,000). So you get linear time for the sorting part. It might not be necessary though (maybe quicksort suffices in order to get all the points).
Also, you can do a trick for the inner loop (the "going to the right of the current element" part). Keep in mind that if S[i+k]-S[i]<32, then S[i+k]-S[i+1]<32, where S is the sorted version of V. With this trick the whole algorithm turns linear.
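The trick above (the left end of the window never moves backwards) might look like this in C++. Treat it as a sketch: the function name count_close_pairs and the limit parameter are illustrative, limit is assumed to be positive, and std::sort is used instead of a counting sort, so the total cost here is O(n log n) for the sort plus O(n) for the scan.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts pairs whose absolute difference is below `limit` (32 in the question).
// Illustrative name; assumes limit > 0.
long long count_close_pairs(std::vector<int> v, int limit = 32) {
    std::sort(v.begin(), v.end());
    long long pairs = 0;
    std::size_t left = 0;  // smallest index with v[right] - v[left] < limit
    for (std::size_t right = 0; right < v.size(); ++right) {
        // left only ever advances, and it stops at right at the latest
        // because v[right] - v[right] == 0 < limit.
        while (v[right] - v[left] >= limit) ++left;
        // Every element in v[left..right-1] forms a valid pair with v[right].
        pairs += static_cast<long long>(right - left);
    }
    return pairs;
}
```

Because left is incremented at most n times in total, the scan after sorting is linear even though it contains a nested loop.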
This can be done in a constant number of passes over the data, and its running time does not depend on the value of the "interval" (in your case, 32).
This is done by populating an array where a[i] = a[i-1] + number_of_times_i_appears_in_the_data; informally, a[i] holds the total number of elements that are smaller than or equal to i.
Code (for a single test case):
constexpr int UPPER_LIMIT = 10001; // constexpr so it can be used as an array bound
constexpr int K = 32;
int frequencies[UPPER_LIMIT] = {0}; // O(U)
int n;
std::cin >> n;
for (int i = 0; i < n; i++) { // O(n)
int x;
std::cin >> x;
frequencies[x] += 1;
}
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
frequencies[i] += frequencies[i-1];
}
int count = 0;
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
int low_idx = std::max(i-K, 0);
int number_of_elements_with_value_i = frequencies[i] - frequencies[i-1];
if (number_of_elements_with_value_i == 0) continue;
// elements with a value in [i-K+1, i-1]
int number_of_elements_with_value_K_close_to_i =
(frequencies[i-1] - frequencies[low_idx]);
count += number_of_elements_with_value_i * number_of_elements_with_value_K_close_to_i;
// Finally, add "duplicates" of i, this is basically sum of arithmetic
// progression with d=1, a0=0, n=number_of_elements_with_value_i
count += number_of_elements_with_value_i * (number_of_elements_with_value_i-1) /2;
}
std::cout << count;
Working full example on IDEone.
You can sort and then use break to end the inner loop whenever the difference goes out of range.
int main()
{
int t;
cin>>t;
while(t--){
int n,c=0;
cin>>n;
int ar[n];
for(int i=0;i<n;i++)
cin>>ar[i];
sort(ar,ar+n);
for(int i=0;i<n;i++){
for(int j=i+1;j<n;j++){
if(ar[j]-ar[i] < 32)
c++;
else
break;
}
}
cout<<c<<endl;
}
}
Or, you can use a hash array for the range and mark occurrence of each element and then loop around and check for each element i.e. if x = 32 - y is present or not.
A good approach here is to split the numbers into separate buckets:
constexpr int limit = 10000;
constexpr int diff = 32;
constexpr int bucket_num = (limit/diff)+1;
std::array<std::vector<int>,bucket_num> buckets;
cin>>n;
int number;
for(i=0;i<n;i++)
{
cin >> number;
buckets[number/diff].push_back(number%diff);
}
Obviously the numbers that are in the same bucket are close enough to each other to fit the requirement, so we can just count all the pairs:
int result = std::accumulate(buckets.begin(), buckets.end(), 0,
[](int s, vector<int>& v){ return s + (v.size()*(v.size()-1))/2; });
The numbers that are in non-adjacent buckets cannot form any acceptable pairs, so we can just ignore them.
This leaves the last corner case - adjacent buckets - which can be solved in many ways:
for(int i=0;i<bucket_num-1;i++)
if(buckets[i].size() && buckets[i+1].size())
result += adjacent_buckets(buckets[i], buckets[i+1]);
Personally I like the "occurrence frequency" approach on the one bucket scale, but there may be better options:
int adjacent_buckets(const vector<int>& bucket1, const vector<int>& bucket2)
{
std::array<int,diff> pairs{};
for(int number : bucket1)
{
for(int i=0;i<number;i++)
pairs[i]++;
}
return std::accumulate(bucket2.begin(), bucket2.end(), 0,
[&pairs](int s, int n){ return s + pairs[n]; });
}
This function first builds an array of "numbers from lower bucket that are close enough to i", and then sums the values from that array corresponding to the upper bucket numbers.
In general this approach has O(N) complexity, in the best case it will require pretty much only one pass, and overall should be fast enough.
Working Ideone example
This solution can be considered O(N) to read the N input numbers and constant time to process them, because the counting loop always runs over the fixed range of 10001 possible values:
#include <iostream>
using namespace std;
void solve()
{
int a[10001] = {0}, N, n, X32 = 0, ret = 0;
cin >> N;
for (int i=0; i<N; ++i)
{
cin >> n;
a[n]++;
}
for (int i=0; i<10001; ++i)
{
if (i >= 32)
X32 -= a[i-32];
if (a[i])
{
ret += a[i] * X32;
ret += a[i] * (a[i]-1)/2;
X32 += a[i];
}
}
cout << ret << endl;
}
int main()
{
int T;
cin >> T;
for (int i=0 ; i<T ; i++)
solve();
}
run this code on ideone
Solution explanation: a[i] represents how many times i appeared in the input.
Then you go over the entire array, and X32 keeps track of the number of elements that are within range of i. The only tricky part is to count correctly when some i is repeated multiple times: a[i] * (a[i]-1)/2. That's it.
You should start by sorting the input.
Then if your inner loop detects the distance grows above 32, you can break from it.
Thanks to everyone for the effort and time spent on this problem.
I appreciate all the attempts to solve it.
After testing the answers on the online judge, I found that the correct and most efficient solutions are Stef's answer and AbdullahAhmedAbdelmonem's answer. pavel's solution is also right, but it is exactly the same as Stef's solution in a different language, C++.
Stef's code ran in 358 ms on the Codeforces online judge and was accepted.
AbdullahAhmedAbdelmonem's code ran in 421 ms on the Codeforces online judge and was also accepted.
If they add a detailed explanation of their algorithm, the bounty will go to one of them.
You can try your solution and submit it to the Codeforces online judge at this link after choosing problem E. Time Limit Exceeded?
I also found a great and more understandable solution using a frequency array; its complexity is O(n).
In this algorithm, each element read increments a specific range of the frequency array:
begin = element - 31
end = element + 31
(clamped to [1, 10000]), so freq[j] holds how many earlier elements lie within 31 of j. Each new element then adds freq[element] to the count:
int main() {
int T,n,i,j,k,b,e,count;
int v[10000];
int freq[10001];
cin>>T;
for(k=0;k<T;k++)
{
count=0;
cin>>n;
for(i=1;i<=10000;i++)
{
freq[i]=0;
}
for(i=0;i<n;i++)
{
cin>>v[i];
}
for(i=0;i<n;i++)
{
count=count+freq[v[i]];
b=v[i]-31;
e=v[i]+31;
if(b<=0)
b=1;
if(e>10000)
e=10000;
for(j=b;j<=e;j++)
{
freq[j]++;
}
}
cout<<count<<endl;
}
return 0;
}
Finally, I think the best approach for this kind of problem is to use a frequency array and count the pairs within a specific range, because its time complexity is O(n).

Extract the n lowest sums from combinations of elements from m arrays for huge datasets

Let's say you have a number of unsorted arrays containing integers. Your job is to make sums of the arrays. The sums have to contain exactly one value from each array, i.e. (for 3 arrays)
sum = array1[2]+array2[12]+array3[4];
Goal: You should output the 20 combinations that generate the lowest possible sums.
The solution below is off-limits as the algorithm needs to be able to handle 10 arrays that can contain a huge number of integers. The following solution is way too slow for larger number of arrays:
//You already have int array1, array2 and array3
int top[20];
for(int i=0; i<20; i++)
top[i] = INT_MAX; // from <climits>; 1e99 does not fit in an int
int sum = 0;
for(int i=0; i<array1.size(); i++) //One for loop per array is trouble for
for(int j=0; j<array2.size(); j++) //increasing numbers of arrays
for(int k=0; k<array3.size(); k++)
{
sum = array1[i] + array2[j] + array3[k];
if (sum < top[19])
swapFunction(sum, top); //Function that adds sum to top
//and sorts top in increasing order
}
printResults(top); // Outputs top 20 lowest sums in increasing order
What would you do to achieve correct results more efficiently (with a lower Big O notation)?
The answer can be found by considering how to find the absolute lowest sum, and how to find the 2nd lowest sum and so on.
As you only need 20 sums at most, you only need the lowest 20 values from each array at most. I would recommend using std::partial_sort for this.
The rest can be accomplished with a priority_queue in which each element contains the current sum and the indices into the arrays for this sum. Simply take each index in indices and increase it by one, calculate the new sum and add that to the priority queue. The topmost item of the queue should always be the one with the lowest sum. Remove the lowest sum, generate the next possibilities, and then repeat until you have enough answers.
Assuming that the number of answers needed (k) is much smaller than the array sizes (N), the Big O should be dominated by partial_sort: roughly N*log(k) comparisons per array, times the number of arrays.
Here's some basic code to demonstrate the idea. There are very likely ways of improving on this. For example, I'm sure that with some work, you could avoid adding the same set of indices multiple times, and thereby eliminate the need for the do-while pop.
for (size_t i = 0; i < arrays.size(); i++)
{
auto b = arrays[i].begin();
partial_sort(b, b + numAnswers, arrays[i].end());
}
struct answer
{
answer(int s, vector<int> i)
: sum(s), indices(i)
{
}
int sum;
vector<int> indices;
bool operator <(const answer &o) const
{
return sum > o.sum;
}
};
auto getSum =[&arrays](const vector<int> &indices) {
auto retval = 0;
for (size_t i = 0; i < arrays.size(); i++)
{
retval += arrays[i][indices[i]];
}
return retval;
};
vector<int> initalIndices(arrays.size());
priority_queue<answer> q;
q.emplace(getSum(initalIndices), initalIndices );
for (auto i = 0; i < numAnswers; i++)
{
auto ans = q.top();
cout << ans.sum << endl;
do
{
q.pop();
} while (!q.empty() && q.top().indices == ans.indices);
for (size_t i = 0; i < ans.indices.size(); i++)
{
auto nextIndices = ans.indices;
nextIndices[i]++;
q.emplace(getSum(nextIndices), nextIndices);
}
}
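One way to avoid queueing the same set of indices twice is to remember every index tuple that has already been pushed. The sketch below swaps in a std::set for that purpose and bounds-checks each index, so it is a variation on the code above, not the same code. The function name smallest_sums is hypothetical; every array is assumed to be non-empty and sums are assumed to fit in long long.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Returns up to k smallest sums that take exactly one element from each array.
// Assumptions: every array is non-empty; the name smallest_sums is illustrative.
std::vector<long long> smallest_sums(std::vector<std::vector<int>> arrays, std::size_t k) {
    if (arrays.empty() || k == 0) return {};
    for (auto& arr : arrays) {
        // Only the k smallest values of each array can appear in the k smallest sums.
        const std::size_t keep = std::min(k, arr.size());
        std::partial_sort(arr.begin(), arr.begin() + static_cast<std::ptrdiff_t>(keep), arr.end());
        arr.resize(keep);
    }
    using Indices = std::vector<std::size_t>;
    using Entry = std::pair<long long, Indices>;  // (sum, one index per array)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;  // min-heap on sum
    std::set<Indices> visited;                    // index tuples that were already queued

    Indices start(arrays.size(), 0);
    long long first = 0;
    for (const auto& arr : arrays) first += arr[0];
    queue.emplace(first, start);
    visited.insert(start);

    std::vector<long long> result;
    while (!queue.empty() && result.size() < k) {
        Entry current = queue.top();
        queue.pop();
        result.push_back(current.first);
        for (std::size_t i = 0; i < arrays.size(); ++i) {
            Indices next = current.second;
            if (++next[i] >= arrays[i].size()) continue;  // no further element in this array
            if (!visited.insert(next).second) continue;   // already queued via another path
            // Update the sum incrementally instead of re-adding every term.
            const long long sum = current.first - arrays[i][next[i] - 1] + arrays[i][next[i]];
            queue.emplace(sum, std::move(next));
        }
    }
    return result;
}
```

With the arrays {1, 2} and {10, 20} and k = 3 this yields 11, 12, 21. The visited set costs extra memory, but each pop adds at most one new entry per array, so the queue stays small when k is small.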

Median of two sorted array

Here I have written code for finding median of two sorted arrays:
#include<iostream>
using namespace std;
#define L 5
#define M 6
const int N=L+M;
int A[1000]; // arrays are used 1-indexed
int B[1000];
int max(int c,int d){
return (c>=d)?c:d;
}
int min(int c,int d)
{
return (c<=d)?c:d;
}
void read(){
cout<<" enter A array "<<endl;
for (int i=1;i<=L;i++)
cin>>A[i];
cout<<endl;
cout<<"enter B array "<<endl;
for (int i=1;i<=M;i++)
cin>>B[i];
cout<<endl;
}
int median(int a[],int b[],int left,int right){
if (left>right) {
return median(b,a,max(1,(N/2)-L),min(M,N/2));
}
int i=int(left+right)/2;
int j=int(N/2)+i;
if((j==0 || a[i]>b[j]) && (j==M || a[i]<=b[j+1])){
return a[i];
}
else
{
if((j==0 || a[i]>b[j]) &&(j!=M && a[i]>b[j+1]))
return median(a,b,left,i-1);
}
return median(a,b,i+1,right);
}
int main(){
return 0;
}
My question is: what should the left and right values be? The algorithm is from Introduction to Algorithms; I just don't understand what the values of the left and right variables are.
I have defined left and right as 1 and N and tested with the following arrays:
3 5 7 9 11 13
1 2 4 8 10
The answer is 13, which is clearly not correct. What is wrong?
The homework problem you cited in a comment has what looks to be a pretty good explanation of left and right, including the starting values for them:
Let the default values for left and right be such that calling
MEDIAN-SEARCH(A,B) is equivalent to
MEDIAN-SEARCH(A[1 ..l],B[1 ..m],max(1,ceil(n/2) - m),min(l,ceil(n/2)))
The invariant in MEDIAN-SEARCH(A,B) is that the median is always in
either A[left ..right] or B. This is true for the initial call because
A and B are sorted, so by the definition of median it must be between
max(1,ceil(n/2) - m) and min(l,ceil(n/2)), inclusive. It is also true
the recursive calls on lines 8 and 9, since the algorithm only
eliminates parts of the array that cannot be the median by the
definition of median. The recursive call on line 2 also preserves the
invariant since if left > right the median must be in B between the
new left and right values.
If you work through the algorithm on paper with small arrays, it should become more clear what's going on. The algorithm converges in only a few steps if your arrays are smaller than a total of say 16 elements, so it should be quite workable on paper.
Please consider the following
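// needs <algorithm>, <iostream>, <iterator> and <vector>
std::cout << "enter all numbers separated by a space, ending with 'q'"
<< std::endl;
// reads ints until extraction fails (for example on the trailing 'q')
std::vector<int> v(
(std::istream_iterator<int>(std::cin)),
std::istream_iterator<int>());
std::sort(v.begin(), v.end());
// std::advance returns void, so index the sorted vector directly;
// assumes at least one number was read (upper middle element for an even count)
std::cout << "median value is: "
<< v[v.size()/2]
<< std::endl;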
Here is the code for finding the median of two sorted arrays of unequal length using the merge method of mergesort
package FindMedianBetween2SortedArrays;
import java.util.Scanner;
public class UsingMergeMethodOfMergeSort {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
try{
System.out.println("Enter the number of elements in the first SORTED array");
int n = in.nextInt();
int[] array1 = new int[n];
System.out.println("Enter the elements of the first SORTED array");
for(int i=0;i<n;i++)
array1[i]=in.nextInt();
System.out.println("Enter the number of elements in the second SORTED array");
int m = in.nextInt();
int[] array2 = new int[m];
System.out.println("Enter the elements of the second SORTED array");
for(int i=0;i<m;i++)
array2[i]=in.nextInt();
System.out.println("Median of the two SORTED arrays is: "+findMedianUsingMergeOfMergeSort(array1,array2));
}
finally{
in.close();
}
}
private static int findMedianUsingMergeOfMergeSort(int[] a, int[] b) {
/* a1 array and a2 array can be of different lengths.
For Example:
1.
a1.length = 3
a2.length = 6
totalElements = 3+6=9 (odd number)
2.
a1.length = 4
a2.length = 4
totalElements = 4+4=8 (even number)
*/
int totalElements = a.length+b.length; // totalElements is the addition of the individual array lengths
int currentMedian = 0;
int prevMedian = 0;
int i=0; // Index for traversing array1
int j=0; // Index for traversing array2
for(int k=0;k<totalElements;k++){ // k is index for traversing the totalElements of array1 and array2
/*NOTE: In this entire for loop, the "if", "else" and "else if" is VERY IMP. DONOT interchange among them*/
// if array1 is exhausted
if(i==a.length)
currentMedian=b[j++]; // elements of the second array would be considered
// if array2 is exhausted
else if(j==b.length)
currentMedian=a[i++]; // elements of the first array would be considered
else if(a[i]<b[j])
currentMedian=a[i++];
else //(b[j]<=a[i]) // this condition is ONLY "else" and not "if" OR "else if"
currentMedian=b[j++];
if(k==totalElements/2) // we reached the middle of the totalElements where the median of the combined arrays is found
break;
prevMedian = currentMedian;
}
// if the totalElements are odd
if(totalElements%2!=0)
return currentMedian;
else
return (prevMedian+currentMedian)/2;
}
}
/*
Analysis:
Time Complexity = Linear Time, O((m+n)/2)
Space Complexity = O(1)
*/
class Solution {
public:
double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
double median;
for(int i = 0; i<nums2.size();++i){
nums1.push_back(nums2[i]);
};
sort(nums1.begin(),nums1.end());
if(nums1.size()%2 == 0){
median = (double)(nums1[(nums1.size()/2)-1] + nums1[nums1.size()/2])/2;
}else{
if(nums1.size() == 1) median = nums1[0];
median = nums1[((nums1.size()/2) + (nums1.size()%2)) - 1];
}
return median;
};
};
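The merge idea from the Java answer above can also be written in C++ without copying or re-sorting the inputs. This is only a sketch under stated assumptions: the function name median_of_sorted is made up, both inputs must already be sorted, and at least one of them must be non-empty. Unlike the Java version, it averages the two middle values in floating point.

```cpp
#include <cstddef>
#include <vector>

// Median of two sorted vectors, found by walking the merge order up to the middle.
// Assumptions: both inputs are sorted and at least one is non-empty.
double median_of_sorted(const std::vector<int>& a, const std::vector<int>& b) {
    const std::size_t total = a.size() + b.size();
    std::size_t i = 0, j = 0;
    int prev = 0, curr = 0;  // the last two values taken in merged order
    for (std::size_t k = 0; k <= total / 2; ++k) {
        prev = curr;
        // Take from a when b is exhausted, or when a's next value is the smaller one.
        if (j == b.size() || (i < a.size() && a[i] < b[j])) curr = a[i++];
        else curr = b[j++];
    }
    if (total % 2 != 0) return curr;                  // odd count: the middle element
    return (static_cast<double>(prev) + curr) / 2.0;  // even count: mean of the two middle elements
}
```

For the arrays from the question (3 5 7 9 11 13 and 1 2 4 8 10) this returns 7. It runs in O(m+n) time and O(1) extra space, whereas the concatenate-and-sort approach is O((m+n) log(m+n)).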