There is an array of n numbers. One number is repeated n/2 times and the other n/2 numbers are distinct - C++

There is an array of n numbers. One number is repeated n/2 times and the other n/2 numbers are distinct. Find the repeated number. (The best solution is O(n) with exactly n/2+1 comparisons.)
The main problem here is the n/2+1 comparison bound.
I have two solutions that are O(n), but they take more than n/2+1 comparisons.
1> Divide the numbers of the array into groups of three and compare the elements within each of those n/3 groups.
e.g. the array is (1 10 3) (4 8 1) (1 1), so the number of comparisons required is 7, which is more than n/2+1, i.e. 8/2+1 = 5.
2> Compare a[i] with a[i+1] and with a[i+2].
e.g. for the array 8 10 3 4 1 1 1 1 this takes 9 comparisons in total.
I would appreciate even a little help.
Thank you.
The required space complexity is O(1).

Of course, since all the other numbers are distinct, you only have to compare pairs: if you find one pair with two equal numbers, you have your answer.
Let's say you have numbers indexed like this (it is just about the indexing):
[1,2,3,4,5,6,7,8,9,10]
You then make n/2 + 1 comparisons like this:
(1,2),(3,4),(5,6),(7,8),(9,7),(9,8)
If all of these pairs are distinct, you return the element at index 10.
The point is that when you reach the last 4 remaining numbers (7,8,9,10), you know that at least two of them are equal, and the 3 comparisons (7,8),(9,7),(9,8) cover every pair among them that does not involve index 10; if none of them matches, the duplicate must involve index 10, as in the sketch below.
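A minimal sketch of this pairing scheme for general even n >= 4 (0-based indexing; the function name is my own): disjoint pairs over the first n-4 elements cost (n-4)/2 comparisons, the last four elements cost 3 more, for n/2+1 in total.

#include <vector>

int find_repeated_pairs(const std::vector<int>& a)
{
    const int n = static_cast<int>(a.size());
    // Compare disjoint pairs among the first n-4 elements: (n-4)/2 comparisons.
    for (int i = 0; i + 1 < n - 4; i += 2)
        if (a[i] == a[i + 1]) return a[i];
    // No pair matched, so each pair held at most one copy of the repeated
    // number, and at least two copies sit in the last 4 slots.
    const int w = a[n - 4], x = a[n - 3], y = a[n - 2];
    if (w == x) return w; // these 3 comparisons cover every pair
    if (w == y) return w; // among the last 4 elements that does
    if (x == y) return x; // not involve a[n-1]
    // None matched, so the duplicate pair involves the last element.
    return a[n - 1];
}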

You just need to find the number that appears twice in the array.
Just start from the beginning, keep a hash (or something similar) of the numbers you've already seen, and stop when you get to a number you have seen before.
Worst case scenario: you see all n/2 distinct numbers first, then the repeated number, and then its repeat: n/2+2 elements examined (because the number you're looking for isn't among the n/2 unique numbers).

I read the part about the O(1) space requirement too late, but anyway, here is my solution:
#include <iterator>
#include <unordered_set>

template <typename ForwardIterator>
ForwardIterator find_repeated_element(ForwardIterator begin, ForwardIterator end)
{
    typedef typename std::iterator_traits<ForwardIterator>::value_type value_type;
    std::unordered_set<value_type> visited_elements;
    for (; begin != end; ++begin)
    {
        // insert returns {iterator, false} if the element was already present
        bool could_insert = visited_elements.insert(*begin).second;
        if (!could_insert) return begin;
    }
    return end;
}

#include <iostream>

int main()
{
    int test[] = {8, 10, 3, 4, 1, 1, 1, 1};
    int* end = test + sizeof test / sizeof *test;
    int* p = find_repeated_element(test, end);
    if (p == end)
    {
        std::cout << "there was no repeated element\n";
    }
    else
    {
        std::cout << "repeated element: " << *p << "\n";
    }
}

Due to the pigeonhole principle, you only need to test the first n/2+2 members of the array: there are only n/2+1 distinct values in total, so among the first n/2+2 elements the repeated number is certain to appear at least twice. Loop through the members, using a hash table to keep track of what you have seen, and stop as soon as some member appears a second time.
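Applied to the hash-table scan, that bound looks like this (a small sketch; the function name is mine):

#include <unordered_set>
#include <vector>

int find_repeated_hash(const std::vector<int>& a)
{
    std::unordered_set<int> seen;
    // A duplicate must occur within the first n/2 + 2 elements.
    const std::size_t limit = a.size() / 2 + 2;
    for (std::size_t i = 0; i < limit && i < a.size(); ++i)
        if (!seen.insert(a[i]).second) // insert fails => seen before
            return a[i];
    return -1; // unreachable for a valid input
}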

Another solution that is O(n) (though not exactly n/2+1 comparisons), but with O(1) space:
Because you have n/2 copies of that number, if you look at the array as if it were sorted, the copies form a block of n/2 consecutive positions. Either it is the lowest number, so the block takes positions 1 to n/2, or it is not, and then the block certainly covers position n/2+1. In every case the block covers at least two of the four positions n/2-1 to n/2+2.
So you can use a selection algorithm to retrieve those 4 elements, i.e. the sorted-position range [n/2-1, n/2+2].
We want the k-th smallest number for each of those positions, and that is exactly what a selection algorithm provides in O(n).
Then the repeated number has to appear at least twice among those 4 values (a simple check).
So the total complexity is 4*O(n) + O(1) = O(n).
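A minimal sketch of that idea, using std::nth_element as the selection routine (linear on average; a worst-case-linear selection such as median-of-medians could be substituted). The function name and layout are my own:

#include <algorithm>
#include <iostream>
#include <vector>

// Select the elements that would sit at sorted positions n/2-1 .. n/2+2
// (1-based), i.e. 0-based indices n/2-2 .. n/2+1; the repeated value must
// occur at least twice among them.
int find_repeated_selection(std::vector<int> a) // by value: nth_element reorders
{
    const int n = static_cast<int>(a.size());
    const int pos[4] = {n/2 - 2, n/2 - 1, n/2, n/2 + 1};
    int v[4];
    for (int i = 0; i < 4; ++i) {
        std::nth_element(a.begin(), a.begin() + pos[i], a.end()); // O(n) average
        v[i] = a[pos[i]];
    }
    for (int i = 0; i < 4; ++i)
        for (int j = i + 1; j < 4; ++j)
            if (v[i] == v[j]) return v[i];
    return -1; // unreachable for a valid input
}

int main()
{
    std::vector<int> test = {8, 10, 3, 4, 1, 1, 1, 1};
    std::cout << find_repeated_selection(test) << "\n"; // prints 1
}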

Regarding the bound of n/2+1 comparisons and the O(1) space requirement, you can (almost) meet the requirements with this approach:
Compare disjoint pairs:
a[x] == a[x+1], a[x+2] == a[x+3] ... a[n-1] == a[n]
If no match is found, increase the step and compare across neighbouring pairs:
a[x] == a[x+2], a[x+1] == a[x+3]
For an array like [8 1 10 1 3 1 4 1] this finds the repeat after n/2+2 comparisons, and it always uses O(1) space.

qsort( ) the array, then scan for the first repeat.
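In C++ that idea is a two-liner with the standard library; a quick sketch (note std::sort uses O(log n) auxiliary space rather than strictly O(1)):

#include <algorithm>
#include <vector>

int find_repeated_sorted(std::vector<int> a) // by value: sorting reorders
{
    std::sort(a.begin(), a.end());
    auto it = std::adjacent_find(a.begin(), a.end()); // first equal neighbours
    return it != a.end() ? *it : -1; // -1: no repeat present
}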

Related

Is below sorting algorithm O(n)?

Algorithm:
insert the count of each element into a map
start from the smallest element
if the current value is present in the map, append it to the output array as many times as its count, then increment the value
if the current value is not in the map, advance to the next value that is present in the map
Complexity: O(max element in array), which is linear, so O(n).
vector<int> sort(vector<int>& can) {
    unordered_map<int,int> mp;
    int first = INT_MAX;
    int last = INT_MIN;
    for (auto &n : can) {
        first = min(first, n);
        last = max(last, n);
        mp[n]++;
    }
    vector<int> out;
    while (first <= last) {
        while (mp.find(first) == mp.end()) first++;
        int cnt = mp[first];
        while (cnt--) out.push_back(first);
        first++;
    }
    return out;
}
"Complexity: O(max element in array) which is linear, so, O(n)."
No, it's not O(n). The while loop iterates last - first + 1 times, and this quantity depends on the array's contents, not the array's length.
Usually we use n to mean the length of the array that the algorithm works on. To describe the range (i.e. the difference between the largest and smallest values in the array), we could introduce a different variable r, and then the time complexity is O(n + r), because the first loop populating the map iterates O(n) times, the second loop populating the vector iterates O(r) times, and its inner loop which counts down from cnt iterates O(n) times in total.
Another, more formal way to define n is the "size of the input", typically measured in the number of bits that it takes to encode the algorithm's input. Suppose the input is an array of length 2, containing just the numbers 0 and M for some number M. In this case, if the number of bits used to encode the input is n, then the number M can be on the order of O(2^n), and the second loop does that many iterations; so by this formal definition the time complexity is exponential.

How to erase elements more efficiently from a vector or set?

Problem statement:
Input:
The first two inputs are the integers n and m. n is the number of knights fighting in the tournament (2 <= n <= 100000); m is the number of battles that will take place (1 <= m <= n-1).
The next line contains n power levels.
The next m lines each contain two integers l and r, indicating the range of knight positions competing in the i-th battle.
After each battle, all knights apart from the one with the highest power level will be eliminated.
The range for each battle is given in terms of the new positions of the knights, not the original positions.
Output:
Output m lines, the i-th line containing the original positions (indices) of the knights eliminated in that battle. Each line is in ascending order.
Sample Input:
8 4
1 0 5 6 2 3 7 4
1 3
2 4
1 3
0 1
Sample Output:
1 2
4 5
3 7
0
Here is a visualisation of this process.
1 2
[(1,0),(0,1),(5,2),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
4 5
[(1,0),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
3 7
[(1,0),(6,3),(7,6),(4,7)]
-----------------
0
[(1,0),(7,6)]
-----------
[(7,6)]
I have solved this problem. My program produces the correct output; however, it is O(n*m) = O(n^2). I believe that if I erase knights more efficiently from the vector, efficiency can be increased. Would it be more efficient to erase elements using a set, i.e. erase contiguous segments rather than individual knights? Is there an alternative way to do this that is more efficient?
#include <cstdio>
#include <vector>
#include <utility>
using namespace std;

#define INPUT1(x) scanf("%d", &x)
#define INPUT2(x, y) scanf("%d%d", &x, &y)
#define OUTPUT1(x) printf("%d\n", x);

int main(int argc, char const *argv[]) {
    int n, m;
    INPUT2(n, m);
    vector< pair<int,int> > knights(n);
    for (int i = 0; i < n; i++) {
        int power;
        INPUT1(power);
        knights[i] = make_pair(power, i);
    }
    while (m--) {
        int l, r;
        INPUT2(l, r);
        int max_in_range = knights[l].first;
        for (int i = l+1; i <= r; i++) if (knights[i].first > max_in_range) {
            max_in_range = knights[i].first;
        }
        int offset = l;
        int range = r-l+1;
        while (range--) {
            if (knights[offset].first != max_in_range) {
                OUTPUT1(knights[offset].second);
                knights.erase(knights.begin()+offset);
            }
            else offset++;
        }
        printf("\n");
    }
}
Well, removing from a vector certainly wouldn't be efficient. Removing from a set or unordered_set would be more effective (use iterators instead of indexes).
Yet the problem would still remain O(n^2), because you have two nested loops running n*m times in total.
--EDIT--
I believe I understand the question now :)
First let's calculate the complexity of your code above. Your worst case would be the case where the max range in every battle is 1 (two knights per battle) and the battles are not ordered with respect to position, which means you have m battles (in this case m = n-1 ~= O(n)).
The outer while loop runs n times.
The for loop runs once each time, which makes it n*1 = n in total.
The second while loop runs once each time, which makes it n again.
Deleting from a vector costs n-1 shifts, which makes it O(n).
Thus, with the cost of the vector erase, the total complexity is O(n^2).
First of all, you don't really need the inner for loop: take the first knight as the max in the range, then compare the rest of the range one by one and remove the defeated ones.
Now, I believe it can be done in O(n log n) using std::map. The key of the map is the position and the value is the power level of the knight.
Before proceeding, note that finding and removing an element in a map is logarithmic, and iterating to the next element is constant.
Finally, your code should look like:
while (m--)                                // n times
    strongest = map.find(first_position);  // find is log(n) --> n*log(n)
    for (opponent = next of strongest;     // runs once, since every range is 1 here
         opponent in range;
         opponent = next opponent)         // iterating is constant
        // removing from the map is log(n) --> n * 1 * log(n)
        if strongest < opponent
            remove strongest; opponent is the new strongest
        else
            remove opponent (be careful to remove it only after iterating to the next one)
OK, now the upper bound would be O(2*n*log n) = O(n log n). If the ranges increase, the run time of the outer loop decreases but the number of remove operations increases. I'm sure the upper bound won't change; let's make it homework for you to calculate :)
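For concreteness, here is a hypothetical runnable version of that pseudocode (my own fleshing-out: it keys the map by original index, collects each battle's losers and sorts them so every output line is ascending). One caveat: locating the l-th alive knight by advancing an iterator from begin() costs O(l), so this is only fast when the ranges stay small, as in the analysis above:

#include <cstdio>
#include <map>
#include <vector>
#include <iterator>
#include <algorithm>

int main() {
    int n, m;
    scanf("%d%d", &n, &m);
    std::map<int, int> alive; // original index -> power level
    for (int i = 0; i < n; ++i) {
        int p;
        scanf("%d", &p);
        alive[i] = p;
    }
    while (m--) {
        int l, r;
        scanf("%d%d", &l, &r);
        auto it = alive.begin();
        std::advance(it, l);                // l-th currently alive knight
        auto strongest = it++;
        std::vector<int> losers;
        for (int duel = l; duel < r; ++duel) {
            auto opponent = it++;           // advance before any erase
            if (opponent->second > strongest->second) {
                losers.push_back(strongest->first);
                alive.erase(strongest);
                strongest = opponent;
            } else {
                losers.push_back(opponent->first);
                alive.erase(opponent);
            }
        }
        std::sort(losers.begin(), losers.end());
        for (int idx : losers) printf("%d ", idx);
        printf("\n");
    }
}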
A solution with a treap is pretty straightforward.
For each query, you need to split the treap by implicit key to obtain the subtree that corresponds to the [l, r] range (it takes O(log n) time).
Then you can iterate over the subtree to find the knight with the maximum strength, and afterwards merge the [0, l) and [r + 1, end) parts of the treap with the node that corresponds to this knight.
It's clear that all parts of the solution except for the subtree traversal and printing work in O(log n) time per query. However, each operation reinserts only one knight and erases the rest of the range, so the total size of the output (and the sum of the sizes of the traversed subtrees) is linear in n. So the total time complexity is O(n log n).
I don't think you can solve this with standard STL containers, because no standard container supports both getting an iterator by index quickly and removing arbitrary elements.
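For reference, a compact sketch of that treap idea (entirely my own illustration, with shortcuts a real program should avoid: eliminated nodes are leaked rather than freed, and priorities come from unseeded rand()):

#include <cstdio>
#include <cstdlib>
#include <vector>

// Treap keyed implicitly by in-order position: split(t, k) cuts off the
// first k knights, merge glues two treaps back together.
struct Node {
    int power, idx, prio, size;
    Node *l, *r;
    Node(int p, int i) : power(p), idx(i), prio(rand()), size(1), l(nullptr), r(nullptr) {}
};
int size(Node* t) { return t ? t->size : 0; }
void update(Node* t) { if (t) t->size = 1 + size(t->l) + size(t->r); }

void split(Node* t, int k, Node*& a, Node*& b) { // a = first k knights, b = rest
    if (!t) { a = b = nullptr; return; }
    if (size(t->l) < k) { split(t->r, k - size(t->l) - 1, t->r, b); a = t; }
    else                { split(t->l, k, a, t->l); b = t; }
    update(t);
}
Node* merge(Node* a, Node* b) {
    if (!a || !b) return a ? a : b;
    if (a->prio > b->prio) { a->r = merge(a->r, b); update(a); return a; }
    b->l = merge(a, b->l); update(b); return b;
}
void collect(Node* t, std::vector<Node*>& out) { // in-order traversal
    if (!t) return;
    collect(t->l, out); out.push_back(t); collect(t->r, out);
}

int main() {
    int n, m;
    scanf("%d%d", &n, &m);
    Node* root = nullptr;
    for (int i = 0; i < n; ++i) {
        int p;
        scanf("%d", &p);
        root = merge(root, new Node(p, i));
    }
    while (m--) {
        int l, r;
        scanf("%d%d", &l, &r);
        Node *left, *mid, *right;
        split(root, r + 1, root, right); // root = positions [0, r]
        split(root, l, left, mid);       // mid  = positions [l, r]
        std::vector<Node*> fighters;
        collect(mid, fighters);          // already in ascending original index
        Node* winner = fighters[0];
        for (Node* f : fighters)
            if (f->power > winner->power) winner = f;
        for (Node* f : fighters)
            if (f != winner) printf("%d ", f->idx); // losers (their nodes leak)
        printf("\n");
        winner->l = winner->r = nullptr; // detach the winner and reinsert it
        winner->size = 1;
        root = merge(merge(left, winner), right);
    }
}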

Finding the number of permutations that satisfy a given condition

I want to find the number of all permutations of n numbers. The numbers are from 1 to n. The given condition is that each i-th position can hold numbers only up to Si, where Si is given for each position.
1 <= n <= 10^6
1 <= Si <= n
For example:
n = 5
so its five elements are
1, 2, 3, 4, 5
and the given Si for each position is:
2, 3, 4, 5, 5
This means that:
The 1st position can hold 1 to 2, that is 1 or 2, but no number from 3 to 5.
Similarly:
The 2nd position can hold numbers 1 to 3 only.
The 3rd position can hold numbers 1 to 4 only.
The 4th position can hold numbers 1 to 5 only.
The 5th position can hold numbers 1 to 5 only.
Some of its permutations are:
1,2,3,4,5
2,3,1,4,5
2,3,4,1,5 etc.
But these cannot be:
3,1,4,2,5 as 3 is present at the 1st position.
1,2,5,3,4 as 5 is present at the 3rd position.
I am not getting any idea how to count all possible permutations under this condition.
Okay, if we have a guarantee that the numbers si are given in non-descending order, then it is possible to calculate the number of permutations in O(n).
The idea of the straightforward algorithm is as follows:
At step i, multiply the result by the current value of si[i];
We chose some number for position i. Since we need a permutation, that number cannot be repeated, so decrement all the remaining si[k], for k from i+1 to the end (i.e. n), by 1;
Increase i by 1, go back to (1).
To illustrate on the example si: 2 3 3 4:
result = 1;
current si is "2 3 3 4", result *= si[0] (= 1*2 == 2), decrease 3, 3 and 4 by 1;
current si is ".. 2 2 3", result *= si[1] (= 2*2 == 4), decrease the last 2 and 3 by 1;
current si is ".... 1 2", result *= si[2] (= 4*1 == 4), decrease the last number by 1;
current si is "...... 1", result *= si[3] (= 4*1 == 4), done.
However, this straightforward approach would require O(n^2) time due to the decreasing steps. To optimize it, we can easily observe that at the moment of result *= si[i], our si[i] has already been decreased exactly i times (assuming we start from 0, of course).
Thus the O(n) way:

unsigned int result = 1;
for (unsigned int i = 0; i < n; ++i)
{
    result *= (si[i] - i);
}
For each si, count the number of elements in your array such that a[i] <= si, using binary search, and store that value in an array count[i]; the answer is then the product of all count[i]. However, we have to remove the redundancy from the answer (as the same number could be counted twice): for that you can sort the si, check how many of them are <= s[i], and decrease each count by that number. The complexity is O(n log(n)). I hope I have at least given you an idea.
To complete Yuriy Ivaskevych's answer: if you don't know whether the sis are given in increasing order, you can sort the sis first and it will also work.
The result will be zero or negative if the permutations are impossible (e.g. 1 1 1 1 1).
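A minimal sketch combining both answers (sort first, then take the product of s[i] - i; the function name is my own). The early return makes the "impossible" case come out as 0:

#include <algorithm>
#include <cstdint>
#include <cstddef>
#include <vector>

uint64_t count_permutations(std::vector<int> s)
{
    std::sort(s.begin(), s.end()); // handle unsorted input
    uint64_t result = 1;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const long long choices = static_cast<long long>(s[i]) - static_cast<long long>(i);
        if (choices <= 0) return 0; // impossible, e.g. s = {1,1,1,1,1}
        result *= choices;          // may overflow for large n; contest
                                    // versions usually ask for the value
                                    // modulo a prime
    }
    return result;
}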
You can try backtracking; it's a bit of a hardcore approach, but it will work.
try:
http://www.thegeekstuff.com/2014/12/backtracking-example/
or google backtracking tutorial C++

Number of swaps in a permutation [duplicate]

This question already has answers here:
Counting the adjacent swaps required to convert one permutation into another
(6 answers)
Closed 8 years ago.
Is there an efficient algorithm (efficient in terms of big O notation) to find the number of swaps needed to convert a permutation P into the identity permutation I? The swaps do not need to be of adjacent elements; they may exchange any two elements.
So for example:
I = {0, 1, 2, 3, 4, 5}, number of swaps is 0
P = {0, 1, 5, 3, 4, 2}, number of swaps is 1 (2 and 5)
P = {4, 1, 3, 5, 0, 2}, number of swaps is 3 (2 with 5, 3 with 5, 4 with 0)
One idea is to write an algorithm like this:
int count = 0;
for (int i = 0; i < n; ++i) {
    for (; P[i] != i; ++count) { // could be permuted multiple times
        // send the value at hand to where it should be
        std::swap(P[P[i]], P[i]);
    }
}
But it is not very clear to me whether this is actually guaranteed to terminate or whether it finds the correct number of swaps. It works on the examples above, and I tried generating all permutations of 5 and of 12 numbers; it always terminates on those.
This problem arises in numerical linear algebra. Some matrix decompositions use pivoting, which effectively swaps the row with the greatest value into the place of the next row to be manipulated, in order to avoid division by small numbers and improve numerical stability. Some decompositions, such as the LU decomposition, can later be used to calculate the matrix determinant, but the sign of the determinant of the decomposition is opposite to that of the original matrix if the number of row swaps is odd.
EDIT: I agree that this question is similar to Counting the adjacent swaps required to convert one permutation into another. But I would argue that this question is more fundamental: converting one permutation into another can be reduced to this problem by inverting the target permutation in O(n), composing the permutations in O(n), and then finding the number of swaps from there to the identity. Solving this question by explicitly representing the identity as another permutation seems suboptimal. Also, the other question had, until yesterday, four answers of which only a single one (by |\/|ad) was seemingly useful, but the description of the method seemed vague. Now user lizusek has provided an answer to my question there. I don't agree with closing this question as a duplicate.
EDIT2: The proposed algorithm actually seems to be rather optimal, as pointed out in a comment by user rcgldr; see my answer to Counting the adjacent swaps required to convert one permutation into another.
I believe the key is to think of the permutation in terms of the cycle decomposition.
This expresses any permutation as a product of disjoint cycles.
Key facts are:
Swapping elements in two disjoint cycles merges those cycles into one longer cycle.
Swapping two elements in the same cycle splits it, producing one extra cycle.
The number of swaps needed is n-c, where c is the number of cycles in the decomposition (the identity consists of n trivial cycles).
Your algorithm always swaps elements in the same cycle, so it will correctly count the number of swaps needed.
If desired, you can also do this in O(n) by computing the cycle decomposition and returning n minus the number of cycles found.
Computing the cycle decomposition can be done in O(n) by starting at the first node and following the permutation until you reach the start again. Mark all visited nodes, then start again at the next unvisited node.
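A minimal sketch of that O(n) cycle-counting method (the function name is mine):

#include <vector>

// Minimum swaps = n minus the number of cycles in the permutation.
int min_swaps_to_identity(const std::vector<int>& p)
{
    const int n = static_cast<int>(p.size());
    std::vector<bool> visited(n, false);
    int cycles = 0;
    for (int i = 0; i < n; ++i) {
        if (visited[i]) continue;
        ++cycles;                              // start of a new cycle
        for (int j = i; !visited[j]; j = p[j])
            visited[j] = true;                 // follow it until it closes
    }
    return n - cycles;
}

For P = {4, 1, 3, 5, 0, 2} this finds 3 cycles and returns 6 - 3 = 3, matching the example above.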
I believe the following are true:
If S(x[0], ..., x[n-1]) is the minimum number of swaps needed to convert x to {0, 1, ..., n-1}, then:
If x[n-1] == n-1, then S(x) == S(x[0], ..., x[n-2]) (i.e., cut off the last element).
If x[n-1] != n-1, then S(x) is one more than S of the array obtained by swapping x[n-1] with the x[i] for which x[i] == n-1 (which puts n-1 into its final place).
S({}) = 0.
This suggests a straightforward algorithm for computing S(x) that runs in O(n) time (note that it modifies x as it goes):

#include <algorithm>

int num_swaps(int x[], int n) {
    if (n == 0) {
        return 0;
    } else if (x[n - 1] == n - 1) {
        return num_swaps(x, n - 1);
    } else {
        // locate the value n-1 and move it into its final place
        int* i = std::find(x, x + n, n - 1);
        std::swap(*i, x[n - 1]);
        return num_swaps(x, n - 1) + 1;
    }
}

Longest Increasing Subsequence using std::set in C++

I have found a piece of code for LIS in a book, but I am not quite able to work out the proof of its correctness. Can someone help me out with that? All the code does is: after inserting a new element into the set, if the new element is not the maximum, delete the element immediately after it.
set<int> s;
set<int>::iterator it;
for (int i = 0; i < n; i++)
{
    s.insert(arr[i]);
    it = s.find(arr[i]);
    it++;
    if (it != s.end())
        s.erase(it);
}
cout << s.size() << endl;
n is the size of the sequence and arr is the sequence. I don't think this code will work if we want to find sequences that are not strictly increasing. Can we modify the code to find increasing sequences in which equality is allowed?
EDIT: the algorithm works only when the inputs are distinct.
There are several solutions to LIS.
The most typical is the O(N^2) algorithm using dynamic programming, where for every index i you calculate the "longest increasing subsequence ending at index i".
You can speed this up to O(N log N) using clever data structures or binary search.
Your code bypasses this and only calculates the length of the LIS.
Consider the input "1 3 4 5 6 7 2": the contents of the set at the end will be "1 2 4 5 6 7", which is not an LIS of the input, but its length is correct.
The proof should go by induction as follows:
After the i-th iteration, the j-th smallest element of the set is the smallest possible end of an increasing sequence of length j within the first i elements of the array.
Consider the input "1 3 2". After the second iteration we have the set "1 3", so 1 is the smallest possible end of an increasing sequence of length 1 and 3 is the smallest possible end of an increasing sequence of length 2.
After the third iteration we have the set "1 2", where now 2 is the smallest possible end of an increasing sequence of length 2.
I hope you can do induction step by yourself :)
The proof is relatively straightforward: consider the set s as a sorted list. We can prove the algorithm correct with a loop invariant: after each iteration, s[k] contains the smallest element of arr that ends an ascending subsequence of length k in the sub-array from zero to the last element of arr that we have considered so far. We can prove this by induction:
After the first iteration, the statement is true, because s contains exactly one element, which is a trivial ascending sequence of one element.
Each iteration can change the set in one of two ways: it can expand it by one element, in the case when arr[i] is the largest element found so far, or replace an existing element with arr[i], which is smaller than the element that was there before.
When an extension of the set occurs, it happens because the current element arr[i] can be appended to the current LIS. When a replacement happens at position k, the index of arr[i] in s, it happens because arr[i] ends an ascending subsequence of length k and is smaller than or equal to the old s[k] that used to end the previous "best" ascending subsequence of length k.
With this invariant in hand, it is easy to see that s contains as many elements as the longest ascending subsequence of arr once the entire arr has been processed.
The code is an O(n log n) solution for LIS, but you want to find non-strictly increasing sequences, and there the implementation has a problem, because std::set doesn't allow duplicate elements. Here is a version that works:
#include <iostream>
#include <set>
using namespace std;

int main()
{
    int arr[] = {4, 4, 5, 7, 6};
    int n = 5;
    multiset<int> s;
    multiset<int>::iterator it;
    for (int i = 0; i < n; i++)
    {
        s.insert(arr[i]);
        // use the member upper_bound: the free-standing std::upper_bound
        // would cost O(n) on the multiset's bidirectional iterators
        it = s.upper_bound(arr[i]);
        if (it != s.end())
            s.erase(it);
    }
    cout << s.size() << endl;
    return 0;
}
Problem statement:
For A(n): a0, a1, …, an-1 we need to find the LIS: the longest subsequence of A(n) such that ai < aj whenever i < j.
For example: 10, 11, 12, 9, 8, 7, 5, 6
The LIS is 10, 11, 12.
This is an O(N^2) solution based on DP.
1. Finding subproblems:
Consider D(i): the length of the LIS of (a0 to ai) that includes ai as its last element.
2. Recurrence relation:
D(i) = 1 + max(D(j)) over all j < i with aj < ai (or just 1 if no such j exists).
3. Base case:
D(0) = 1;
Check out this link for the code:
https://innosamcodes.wordpress.com/2013/07/06/longest-increasing-subsequence/
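For reference, a minimal self-contained sketch of that O(N^2) DP (my own code, not necessarily the one behind the link):

#include <algorithm>
#include <vector>

// D[i] = length of the longest increasing subsequence ending exactly at a[i].
int lis_length(const std::vector<int>& a)
{
    const int n = static_cast<int>(a.size());
    if (n == 0) return 0;
    std::vector<int> D(n, 1);          // base case: each element by itself
    for (int i = 1; i < n; ++i)
        for (int j = 0; j < i; ++j)
            if (a[j] < a[i])
                D[i] = std::max(D[i], D[j] + 1);
    return *std::max_element(D.begin(), D.end());
}

For the example 10, 11, 12, 9, 8, 7, 5, 6 it returns 3, the length of 10, 11, 12.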