Finding an element in partially sorted array - c++

I was asked the following interview question.
There is an array of n x n elements. The array is partially sorted, i.e. the biggest element in row i is smaller than the smallest element in row i+1.
How can you find a given element with complexity O(n)?
Here is my take on this:
Go to row n/2 and start comparing. For example, if you are searching for 100 and the first number you see is 110, you know the target is either in this row or in the rows above, so you go to row n/4, and so on.
From the comments:
Isn't it O(n * log n) in total? He has to parse through every row that he reaches per binary search, therefore the number of linear searches is multiplied with the number of rows he will have to scan in average. – Martin Matysiak
I am not sure that this is the right solution. Does anyone have something better?

Your solution indeed takes O(n log n) assuming you're searching each row you parse. If you don't search each row, then you can't accurately perform the binary step.
O(n) solution:
Pick row n/2. Instead of searching the entire row, simply take the first element of the previous row and the first element of the next row. This is O(1).
We know that all elements of row n/2 must lie between these two values (this is the key observation). If our target value lies in this interval, then search all three rows (3*O(n) = O(n)).
If our value is outside this range, then continue in binary search fashion: select row n/4 if our value is less than the range, or row 3n/4 if it is greater, and again compare against one element of each adjacent row.
Finding the right block of 3 rows costs O(1) * O(log n) = O(log n), and finding the element within it costs O(n).
In total: O(log n) + O(n) = O(n).
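The idea above can be sketched in C++ roughly as follows. This is a simplified variant, not the exact three-row scheme: it binary-searches the first element of each row (these are strictly increasing because of the row ordering) and then scans at most two rows. The function name findInRowBlocks and the (-1, -1) "not found" result are assumptions made for this illustration, and every row is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns {row, col} of target, or {-1, -1} if it is absent.
// Assumes every element of row i is smaller than every element of row i+1.
std::pair<int, int> findInRowBlocks(const std::vector<std::vector<int>>& grid,
                                    int target) {
    const int rows = static_cast<int>(grid.size());
    if (rows == 0) return {-1, -1};

    // O(log n): find the last row whose first element is <= target.
    // If there is no such row, fall back to row 0.
    int lo = 0, hi = rows - 1, row = 0;
    while (lo <= hi) {
        const int mid = lo + (hi - lo) / 2;
        assert(!grid[mid].empty());
        if (grid[mid][0] <= target) {
            row = mid;
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }

    // O(n): the target can only be in this row or the next one,
    // because rows themselves are unsorted.
    const int lastRow = std::min(rows - 1, row + 1);
    for (int r = row; r <= lastRow; ++r) {
        for (int c = 0; c < static_cast<int>(grid[r].size()); ++c) {
            if (grid[r][c] == target) return {r, c};
        }
    }
    return {-1, -1};
}
```

Rows before the selected one cannot contain the target because all their elements are smaller than the selected row's first element, and rows after the next one are entirely larger than the next row's first element, which already exceeds the target.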

Here is a simple implementation. Since we need O(n) to find an element within a row anyhow, I left out the binary search and scan the first column linearly instead.
void search(int n[][], int el) {
    int minrow = 0, maxrow;
    // advance past every row whose first element is <= el
    while (minrow < n.length && el >= n[minrow][0])
        ++minrow;
    minrow = Math.max(0, minrow - 1);
    maxrow = Math.min(n.length - 1, minrow + 1);
    // el can only be in minrow or the row after it
    for (int row = minrow; row <= maxrow; ++row) {
        for (int col = 0; col < n[row].length; ++col) {
            if (n[row][col] == el) {
                System.out.printf("found at %d,%d\n", row, col);
            }
        }
    }
}

Related

Is below sorting algorithm O(n)?

Algorithm:
Insert the element counts into a map.
Start from the first (smallest) element.
If first is present in the map, insert it into the output array as many times as its count, then increment first.
If first is not in the map, find the next number which is present in the map.
Complexity: O(max element in array) which is linear, so, O(n).
vector<int> sort(vector<int>& can) {
    unordered_map<int,int> mp;
    int first = INT_MAX;
    int last = INT_MIN;
    for(auto &n : can) {
        first = min(first, n);
        last = max(last, n);
        mp[n]++;
    }
    vector<int> out;
    while(first <= last) {
        while(mp.find(first) == mp.end()) first++;
        int cnt = mp[first];
        while(cnt--) out.push_back(first);
        first++;
    }
    return out;
}
"Complexity: O(max element in array) which is linear, so, O(n)."
No, it's not O(n). The while loop iterates last - first + 1 times, and this quantity depends on the array's contents, not the array's length.
Usually we use n to mean the length of the array that the algorithm works on. To describe the range (i.e. the difference between the largest and smallest values in the array), we could introduce a different variable r, and then the time complexity is O(n + r), because the first loop populating the map iterates O(n) times, the second loop populating the vector iterates O(r) times, and its inner loop which counts down from cnt iterates O(n) times in total.
Another more formal way to define n is the "size of the input", typically measured in the number of bits that it takes to encode the algorithm's input. Suppose the input is an array of length 2, containing just the numbers 0 and M for some number M. In this case, if the number of bits used to encode the input is n, then the number M can be on the order of O(2^n), and the second loop does that many iterations; so by this formal definition the time complexity is exponential.
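To make the dependence on the value range visible, here is a small sketch that performs the same map-based counting sort but also counts how many candidate values the second loop visits. The names SortStats and mapCountingSort are made up for this illustration, and the loop is restructured slightly (a single for loop over the range using long long to avoid overflow), so treat it as an approximation of the code in the question rather than a copy of it.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

struct SortStats {
    std::vector<int> sorted;  // the sorted output
    long long rangeSteps;     // how many candidate values were visited
};

SortStats mapCountingSort(const std::vector<int>& input) {
    SortStats result{{}, 0};
    if (input.empty()) return result;

    // First loop: O(n) iterations, n = input.size().
    std::unordered_map<int, int> counts;
    int lo = input[0], hi = input[0];
    for (int v : input) {
        lo = std::min(lo, v);
        hi = std::max(hi, v);
        ++counts[v];
    }

    // Second loop: visits every value in [lo, hi], i.e. r + 1 values
    // where r = hi - lo, regardless of how short the input is.
    for (long long v = lo; v <= hi; ++v) {
        ++result.rangeSteps;
        auto it = counts.find(static_cast<int>(v));
        if (it == counts.end()) continue;
        // Emitting the copies costs O(n) in total across all values.
        result.sorted.insert(result.sorted.end(), it->second, it->first);
    }
    return result;
}
```

For the two-element input {0, 1000000} the second loop visits 1000001 values, which is the r term in O(n + r).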

How to erase elements more efficiently from a vector or set?

Problem statement:
Input:
The first two inputs are integers n and m. n is the number of knights fighting in the tournament and m is the number of battles that will take place (2 <= n <= 100000, 1 <= m <= n-1).
The next line contains n power levels.
The next m lines each contain two integers l and r, indicating the range of knight positions that compete in the ith battle.
After each battle, all knights apart from the one with the highest power level will be eliminated.
The range for each battle is given in terms of the new positions of the knights, not the original positions.
Output:
Output m lines, the ith line containing the original positions (indices) of the knights eliminated in that battle. Each line is in ascending order.
Sample Input:
8 4
1 0 5 6 2 3 7 4
1 3
2 4
1 3
0 1
Sample Output:
1 2
4 5
3 7
0
Here is a visualisation of this process.
1 2
[(1,0),(0,1),(5,2),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
4 5
[(1,0),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
3 7
[(1,0),(6,3),(7,6),(4,7)]
-----------------
0
[(1,0),(7,6)]
-----------
[(7,6)]
I have solved this problem. My program produces the correct output; however, it is O(n*m) = O(n^2). I believe that if I erase knights more efficiently from the vector, efficiency can be increased. Would it be more efficient to erase elements using a set, i.e. erase contiguous segments rather than individual knights? Is there an alternative way to do this that is more efficient?
#include <cstdio>
#include <utility>
#include <vector>
using namespace std;

#define INPUT1(x) scanf("%d", &x)
#define INPUT2(x, y) scanf("%d%d", &x, &y)
#define OUTPUT1(x) printf("%d ", x)

int main(int argc, char const *argv[]) {
    int n, m;
    INPUT2(n, m);
    vector< pair<int,int> > knights(n);
    for (int i = 0; i < n; i++) {
        int power;
        INPUT1(power);
        knights[i] = make_pair(power, i);  // (power, original position)
    }
    while(m--) {
        int l, r;
        INPUT2(l, r);
        // find the highest power level in [l, r]
        int max_in_range = knights[l].first;
        for (int i = l+1; i <= r; i++) if (knights[i].first > max_in_range) {
            max_in_range = knights[i].first;
        }
        int offset = l;
        int range = r-l+1;
        while (range--) {
            if (knights[offset].first != max_in_range) {
                OUTPUT1(knights[offset].second);
                knights.erase(knights.begin()+offset);  // O(n) shift per erase
            }
            else offset++;
        }
        printf("\n");
    }
}
Well, removing from a vector wouldn't be efficient for sure. Removing from a set or unordered set would be more efficient (use iterators instead of indexes).
Yet the solution will still remain O(n^2), because you have two nested while loops running n*m times.
--EDIT--
I believe I understand the question now :)
First, let's calculate the complexity of your code above. Your worst case would be when the range in every battle is 1 (two knights per battle) and the battles are not ordered with respect to position. That means you have m battles (in this case m = n-1, i.e. O(n)).
The first while loop runs n times.
The for loop runs once each time, which makes it n*1 = n in total.
The second while loop runs once each time, which makes it n again.
Deleting from a vector means up to n-1 shifts, which makes it O(n).
Thus, with the cost of the vector erase, the total complexity is O(n^2).
First of all, you don't really need the inner for loop. Take the first knight as the max in range, compare the rest in the range one by one and remove the defeated ones.
Now, I believe it can be done in O(n log n) using std::map. The key of the map is the position and the value is the level of the knight.
Before proceeding: finding and removing an element in a map is logarithmic, and advancing an iterator is amortized constant.
Finally, your code should look like:
while(m--)                                  // n times
    strongest = map.find(first_position);   // find is log(n) --> n*log(n)
    for (opponent = next of strongest;      // this will run 1 time, since every range is 1
         opponent in range;
         opponent = next opponent)          // iterating is constant
        // removing from map is log(n) --> n * 1 * log(n)
        if strongest < opponent
            remove strongest, opponent is the new strongest
        else
            remove opponent (be careful to remove it after iterating to next)
OK, now the upper bound would be O(2*n log n) = O(n log n). If the ranges increase, the run time of the outer loop decreases but the number of remove operations increases. I'm sure the upper bound won't change; let's make it homework for you to calculate :)
A solution with a treap is pretty straightforward.
For each query, you need to split the treap by implicit key to obtain the subtree that corresponds to the [l, r] range (it takes O(log n) time).
After that, you can iterate over the subtree and find the knight with the maximum strength. After that, you just need to merge the [0, l) and [r + 1, end) parts of the treap with the node that corresponds to this knight.
It's clear that all parts of the solution except for the subtree traversal and printing work in O(log n) time per query. However, each operation reinserts only one knight and erases the rest from the range, so the size of the output (and the sum of the sizes of the subtrees) is linear in n. So the total time complexity is O(n log n).
I don't think you can solve it with standard STL containers, because there is no standard container that supports both getting an iterator by index quickly and removing arbitrary elements.
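As an alternative to a treap, the "find the knight at current position k" step can also be sketched with a hand-written Fenwick (binary indexed) tree over alive flags. This is a different technique from the one described above, not an implementation of it, and the names AliveIndex and runTournament are invented for this example. Each battle looks up its r-l+1 fighters in O(log n) apiece, and since every battle eliminates all but one of them, the total work is O((n + m) log n).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Fenwick tree over 0/1 flags: flag i is 1 while knight i is still alive.
struct AliveIndex {
    int n;
    std::vector<int> tree;
    explicit AliveIndex(int size) : n(size), tree(size + 1, 0) {
        for (int i = 0; i < n; ++i) add(i, 1);
    }
    void add(int i, int delta) {
        for (++i; i <= n; i += i & -i) tree[i] += delta;
    }
    // Original index of the knight at current position k (0-based).
    // Precondition: at least k + 1 knights are alive.
    int kth(int k) const {
        int pos = 0;
        int step = 1;
        while (step * 2 <= n) step *= 2;
        for (; step > 0; step /= 2) {
            if (pos + step <= n && tree[pos + step] <= k) {
                pos += step;
                k -= tree[pos];
            }
        }
        return pos;
    }
};

// Returns, for every battle, the original indices of the eliminated
// knights in ascending order. Ranges [l, r] use current positions.
std::vector<std::vector<int>> runTournament(
        const std::vector<int>& power,
        const std::vector<std::pair<int, int>>& battles) {
    AliveIndex alive(static_cast<int>(power.size()));
    std::vector<std::vector<int>> eliminated;
    for (const auto& battle : battles) {
        assert(battle.first <= battle.second);
        // Look up all fighters before removing anyone, so that the
        // positions are not shifted while we are still reading them.
        std::vector<int> fighters;
        for (int k = battle.first; k <= battle.second; ++k)
            fighters.push_back(alive.kth(k));
        int winner = fighters[0];
        for (int idx : fighters)
            if (power[idx] > power[winner]) winner = idx;
        std::vector<int> losers;
        for (int idx : fighters) {
            if (idx == winner) continue;
            losers.push_back(idx);
            alive.add(idx, -1);  // mark as eliminated
        }
        eliminated.push_back(losers);
    }
    return eliminated;
}
```

Because the fighters are looked up in increasing position order, each list of losers is already in ascending order of original index.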

Finding MIN MAX pairs from array

Given a sorted array of N integers, I need to consider all pairs with different indexes (i != j). I need the maximum of (a[j]+a[i]-1) and the minimum of (a[j]-a[i]+1) over all pairs with j > i. Numbers aren't unique, but pairing equal numbers is allowed. A number can't pair with itself.
What I'm doing right now :
for(i=0;i<n;i++)
{
    for(j=i+1;j<n;j++)
    {
        MAX = max(MAX, a[j]+a[i]-1);
        MIN = min(MIN, a[j]-a[i]+1);
    }
}
This gives the time complexity of O(n^2). Is there a way to reduce it to O(nlogn) or even less ?
To find the max you just need to add the elements at index n-1 and n-2, as the array is already sorted and the 2 biggest elements will be only at the end of the array. No other element in the array will be bigger than these and hence their sum will also be greater than the sum of any other elements.
MAX = a[n-1] + a[n-2] - 1;
Time complexity : O(1)
For finding the min, you should look for a pivot in the array. I chose to start from a[0]. If space is not a constraint, create another array of the same size and populate it with the delta values from your pivot.
int[] b = new int[n];
for(int i=1; i<n; i++)
{
    b[i] = a[i] - a[0];
}
Now the second array holds the delta values from your pivot. Because a is sorted, b is sorted as well, so the two closest values are always neighbours: scan b once and take the smallest difference between adjacent entries. That difference plus 1 is MIN.
Time Complexity : O(n) + O(n) = O(n)
Space Complexity : O(n) as a new array of same size has to be created.
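Since the input is sorted, the smallest a[j] - a[i] over j > i is always attained by some pair of adjacent elements, so both values can be obtained in one pass without the extra array. A minimal sketch, assuming the array has at least two elements; the function name maxMinPairValues is invented for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {MAX, MIN} as defined in the question:
//   MAX = max over j > i of a[j] + a[i] - 1
//   MIN = min over j > i of a[j] - a[i] + 1
// Assumes a is sorted in ascending order and has at least two elements.
std::pair<long long, long long> maxMinPairValues(const std::vector<int>& a) {
    assert(a.size() >= 2);
    const std::size_t n = a.size();

    // The two largest elements sit at the end of a sorted array.
    const long long maxValue = static_cast<long long>(a[n - 1]) + a[n - 2] - 1;

    // The smallest difference is between two neighbours.
    long long minGap = static_cast<long long>(a[1]) - a[0];
    for (std::size_t i = 2; i < n; ++i) {
        minGap = std::min(minGap, static_cast<long long>(a[i]) - a[i - 1]);
    }
    return {maxValue, minGap + 1};
}
```

This takes O(n) time and O(1) extra space.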

What do move and key comparison mean in c++?

The following is written in a PowerPoint slide about Insertion Sort from my class:
void insertionSort(DataType theArray[], int n) {
    for (int unsorted = 1; unsorted < n; ++unsorted) {
        DataType nextItem = theArray[unsorted];
        int loc = unsorted;
        for (;(loc > 0) && (theArray[loc-1] > nextItem); --loc)
            theArray[loc] = theArray[loc-1];
        theArray[loc] = nextItem;
    }
}
-
Running time depends not only on the size of the array but also on the contents of the array.
Best-case: O(n)
Array is already sorted in ascending order.
Inner loop will not be executed.
>>>> The number of moves: 2*(n-1), which is O(n)
>>>> The number of key comparisons: (n-1), which is O(n)
Worst-case: O(n^2)
Array is in reverse order:
Inner loop is executed p-1 times, for p = 2,3, …, n
The number of moves: 2*(n-1)+(1+2+...+n-1) = 2*(n-1) + n*(n-1)/2, which is O(n^2)
The number of key comparisons: (1+2+...+n-1) = n*(n-1)/2, which is O(n^2)
Average-case: O(n^2)
We have to look at all possible initial data organizations.
So, Insertion Sort is O(n^2)
What exactly are a move and a key comparison? I couldn't find an explanation on Google.
Let me describe the algorithm in words first.
Assume that at a given time the array has two parts: index 0 to index loc - 1 is sorted in ascending order, and index loc to n - 1 is unsorted.
Start with the element at loc, find its correct place in the sorted part of the array, and insert it there.
So there are two loops:
The outer loop runs from loc = 1 to loc = n - 1 and basically partitions the array into a sorted and an unsorted part.
The inner loop finds the position of the element at loc in the sorted part of the array (0 to loc - 1).
For the inner loop to find the correct location, you have to compare the element at loc with, in the worst case, all the elements in the sorted part of the array. Each such comparison (theArray[loc-1] > nextItem in the code) is a key comparison.
To insert, you have to create a gap in the sorted part of the array for the element at loc. This is done by shifting each larger element of the sorted part one position to the right. Each such copy of an element is a move, as are the copy into nextItem and the copy back into the array, which is where the 2*(n-1) term comes from.
So the number of moves is the number of element copies performed while sorting, and the keys are the data values that are compared.
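One way to check the slide's formulas is to instrument the same insertion sort so that it counts every element copy as a move and every evaluation of theArray[loc-1] > nextItem as a key comparison. The struct SortCounts and the function insertionSortCounted are names made up for this sketch, and it uses int elements in a std::vector instead of DataType.

```cpp
#include <cassert>
#include <vector>

struct SortCounts {
    long long moves = 0;        // element copies
    long long comparisons = 0;  // evaluations of a[loc-1] > nextItem
};

SortCounts insertionSortCounted(std::vector<int>& a) {
    SortCounts counts;
    const int n = static_cast<int>(a.size());
    for (int unsorted = 1; unsorted < n; ++unsorted) {
        int nextItem = a[unsorted];  // move: copy the item out
        ++counts.moves;
        int loc = unsorted;
        while (loc > 0) {
            ++counts.comparisons;    // key comparison
            if (!(a[loc - 1] > nextItem)) break;
            a[loc] = a[loc - 1];     // move: shift one step to the right
            ++counts.moves;
            --loc;
        }
        a[loc] = nextItem;           // move: copy the item back in
        ++counts.moves;
    }
    return counts;
}
```

For n = 5, an already sorted array gives 2*(n-1) = 8 moves and n-1 = 4 comparisons, while a reversed array gives 2*(n-1) + n*(n-1)/2 = 18 moves and n*(n-1)/2 = 10 comparisons, matching the slide.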

choose n largest elements in two vector

I have two vectors, each containing n unsorted elements. How can I get the n largest elements across these two vectors?
My solution is to merge the two vectors into one with 2n elements and then use the std::nth_element algorithm, but I found that it's not very efficient. Does anyone have a more efficient solution? I'd really appreciate it.
You may push the elements into priority_queue and then pop n elements out.
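A minimal sketch of that suggestion, using int elements and a hypothetical function name largestViaPriorityQueue: everything goes into a max-heap and the top n values are popped off in descending order.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Returns the n largest elements of a and b combined, largest first.
// If fewer than n elements exist in total, all of them are returned.
std::vector<int> largestViaPriorityQueue(const std::vector<int>& a,
                                         const std::vector<int>& b,
                                         std::size_t n) {
    // std::priority_queue is a max-heap by default.
    std::priority_queue<int> heap(a.begin(), a.end());
    for (int x : b) heap.push(x);

    std::vector<int> out;
    out.reserve(n);
    while (out.size() < n && !heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    return out;
}
```

Each push and pop is logarithmic in the heap size, so this is simple but not asymptotically better than sorting the merged vector.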
Assuming that n is far smaller than N (the total number of elements), this is quite efficient. Getting minElem is cheap, and a sorted insert into L is cheaper than sorting the two vectors if n << N.
L := SortedList()
For Each element in any of the vectors do
{
    if( size of L < n or element >= smallest element in L )
    {
        add element to L
        if( size of L > n )
        {
            remove smallest element from L
        }
    }
}
vector<T> heap;
heap.reserve(n + 1);
vector<T>::iterator left = leftVec.begin(), right = rightVec.begin();
// seed the heap with the first n elements
for (int i = 0; i < n; i++) {
    if (left != leftVec.end()) heap.push_back(*left++);
    else if (right != rightVec.end()) heap.push_back(*right++);
}
if (left == leftVec.end() && right == rightVec.end()) return heap;
make_heap(heap.begin(), heap.end(), greater<T>());  // min-heap
while (left != leftVec.end()) {
    // add one element, then drop the smallest so that n remain
    heap.push_back(*left++);
    push_heap(heap.begin(), heap.end(), greater<T>());
    pop_heap(heap.begin(), heap.end(), greater<T>());
    heap.pop_back();
}
/* ... repeat for right ... */
return heap;
Note I use *_heap directly rather than priority_queue because priority_queue does not provide access to its underlying data structure. This is O(N log n), slightly better than the naive O(N log N) method if n << N.
You can do the "n'th element" algorithm conceptually in parallel on the two vectors quite easily (at least the simple variant that's only linear in the average case).
Pick a pivot.
Partition (std::partition) both vectors by that pivot. You'll have the first vector partitioned by some element with rank i and the second by some element with rank j. I'm assuming descending order here.
If i+j < n, recurse on the right side for the n-i-j greatest elements. If i+j > n, recurse on the left side for the n greatest elements. If you hit i+j==n, stop the recursion.
You basically just need to make sure to partition both vectors by the same pivot in every step. Given a decent pivot selection, this algorithm is linear in the average case (and works in-place).
See also: http://en.wikipedia.org/wiki/Selection_algorithm#Partition-based_general_selection_algorithm
Edit: (hopefully) clarified the algorithm a bit.
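A rough sketch of this idea follows, with a few liberties that are not part of the answer above: it uses a three-way split (greater, equal, smaller) so that duplicates cannot stall the loop, it iterates instead of recursing, it always takes the middle element of the current range as the pivot, and it copies the selected elements into a separate output vector rather than leaving them in place. The function name largestByParallelPartition is made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the n largest elements of a and b combined, in no particular
// order. Both vectors are taken by value because they get partitioned.
std::vector<int> largestByParallelPartition(std::vector<int> a,
                                            std::vector<int> b,
                                            std::size_t n) {
    auto aLo = a.begin(), aHi = a.end();
    auto bLo = b.begin(), bHi = b.end();
    std::vector<int> out;
    std::size_t need = n;

    while (need > 0 && (aLo != aHi || bLo != bHi)) {
        // Same pivot for both vectors, taken from the active range.
        const int pivot = (aLo != aHi) ? *(aLo + (aHi - aLo) / 2)
                                       : *(bLo + (bHi - bLo) / 2);
        auto isGreater = [pivot](int x) { return x > pivot; };
        auto isEqual = [pivot](int x) { return x == pivot; };

        // Layout after partitioning: [greater | equal | smaller].
        auto aGt = std::partition(aLo, aHi, isGreater);
        auto aEq = std::partition(aGt, aHi, isEqual);
        auto bGt = std::partition(bLo, bHi, isGreater);
        auto bEq = std::partition(bGt, bHi, isEqual);

        const std::size_t greater =
            static_cast<std::size_t>((aGt - aLo) + (bGt - bLo));
        const std::size_t equal =
            static_cast<std::size_t>((aEq - aGt) + (bEq - bGt));

        if (need < greater) {
            // Too many candidates: keep only the "greater" parts.
            aHi = aGt;
            bHi = bGt;
        } else {
            // Every "greater" element belongs to the result.
            out.insert(out.end(), aLo, aGt);
            out.insert(out.end(), bLo, bGt);
            need -= greater;
            // Then as many copies of the pivot as are still needed.
            const std::size_t take = std::min(need, equal);
            out.insert(out.end(), take, pivot);
            need -= take;
            // Continue in the "smaller" parts if anything is missing.
            aLo = aEq;
            bLo = bEq;
        }
    }
    return out;
}
```

Every round either discards the pivot together with everything not greater than it, or consumes the pivot, so the active ranges shrink each time; a production version would choose the pivot more carefully.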