Counting the Number of Element Comparisons in Quick Sort - c++

I have been provided this predefined code for Quick Sort which isn't to be altered much:
I know we already have questions on this but this is different as the logic is predefined here.
void quicksort(int a[], int l, int r)
{
if (r <= l) return;
/* call for partition function that you modify */
quicksort(a, l, i-1);
quicksort(a, i+1, r);
}
int partition(int a[], int l, int r)
{ int i = l-1, j = r; int v = a[r];
for (;;)
{
while (a[++i] < v) ;
while (v < a[--j]) if (j == l) break;
if (i >= j) break;
exch(a[i], a[j]);
}
exch(a[i], a[r]);
return i;
}
We are only required to make a slight modification so that quicksort returns the number of comparisons that it and the partition function together (in total) have performed in sorting the given array a. Only the comparisons that involve array elements are counted, and no global variable may be used to count them.
I have implemented this as follows, kindly let me know if I'm mistaken somewhere:
int partition(int a[], int l, int r, int& count) {
int i = l - 1, j = r; int v = a[r];
for (;;) {
while (a[++i] < v) count++;
while (v < a[--j]) {
count++;
if (j == l) break;
}
if (i >= j) break;
swap(a[i], a[j]);
}
swap(a[i], a[r]);
return i;
}
int quickSort(int a[], int l, int r) {
int count = 0;
if (r <= l) return 0;
int i = partition(a, l, r, count);
return count + quickSort(a, l, i - 1) + quickSort(a, i + 1, r);
}
Once you confirm, I'm going to share a surprising result of my research on this with you.

rici's comment works fine as a solution. An alternative approach, which generalizes to std::sort and other algorithms, is to write a counting comparer.
Something along the lines of
struct CountingComparer{
CountingComparer():count(0){}
CountingComparer(const CountingComparer& cc):count(cc.count){}
bool operator()(int lhs, int rhs){
++count; // count every comparison of two elements
return lhs < rhs;
}
size_t count;
};
Now you need to change the signature of your function to add the comparer as a last argument. Like
template<typename COMP>
void quicksort( .... , COMP comp){
And the same change to the partition function. The comparisons are then made by
while (comp(a[++i],v)) and
while (comp(v, a[--j])) respectively.
Calling the sort
You need to make sure you have a reference to your comparer in the template argument.
CountingComparer comp;
quicksort(... , std::ref(comp));
Passing std::ref(comp) makes sure comp is not copied.
After the sort you find the number of comparisons in comp.count.
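Put together, the comparer approach might look like the following minimal sketch. The names partitionWith and quicksortWith are made up for this illustration, and the partition logic is the predefined one from the question with each element comparison routed through comp.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Counts every call, i.e. every comparison of two array elements.
struct CountingComparer {
    std::size_t count = 0;
    bool operator()(int lhs, int rhs) {
        ++count;
        return lhs < rhs;
    }
};

// Hypothetical name; same scheme as the predefined partition, pivot is a[r].
template <typename Comp>
int partitionWith(int a[], int l, int r, Comp comp) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (comp(a[++i], v)) {}
        while (comp(v, a[--j])) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Hypothetical name; comp is passed by value, so callers wrap it in std::ref.
template <typename Comp>
void quicksortWith(int a[], int l, int r, Comp comp) {
    if (r <= l) return;
    int i = partitionWith(a, l, r, comp);
    quicksortWith(a, l, i - 1, comp);
    quicksortWith(a, i + 1, r, comp);
}
```

A call such as quicksortWith(a, 0, n - 1, std::ref(comp)) leaves the total in comp.count. Passing comp directly would increment only copies, and the caller's count would stay at 0.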
Regarding your comment on counts
Your quicksort's behaviour is discussed extensively on the Wikipedia page. Because the rightmost element is always chosen as the pivot, it is expected that an already sorted array behaves badly, while random elements behave well.
Regarding the cleverness in the partition function
for (;;)
{
while (a[++i] < v) ;
while (v < a[--j]) if (j == l) break;
if (i >= j) break;
exch(a[i], a[j]);
}
exch(a[i], a[r]);
The for (;;) statement isn't really counting anything, so it is just a while (true) in disguise. It ends through a break statement.
Find the first large element to swap: The while (a[++i] < v); statement takes advantage of the fact that the pivot element (v, i.e. a[r]) is the rightmost element, so the pivot acts as a sentinel here and i cannot run past r.
Find the first small element to swap: The while (v < a[--j]) if (j == l) break; loop has no such guarantee from the pivot. Instead it checks explicitly for the leftmost limit.
The final check just tests whether the partition is done. If so, it breaks out of the infinite loop, and finally
swap(a[i], a[r]); moves the pivot element to its correct position.

Related

Converting an iterative function to a recursive function without changing parameters

I want to convert the iterative template function getSmallest into a recursive function without changing anything in main (no changing function parameters etc.) because in class we are being taught to always keep the public interfaces of our functions the same (so if we work in a big project, we don't start changing things that break the whole program)
Here's the program I wish to convert:
// PRE: 0 <= start < end <= length of arr
// PARAM: arr = array of integers
// start = start index of sub-array
// end = end index of sub-array + 1
// POST: returns index of smallest value in arr{start:end}
template <class T>
int getSmallest(T arr[], int start, int end) {
int smallest = start;
for (int i = start + 1; i < end; ++i) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
}
return smallest;
}
I have spent the last few hours scouring class notes, the internet and stackoverflow for any help but it all seems not related to my problem, so I am asking here.
Here is the best attempt I came up with:
template <class T>
int getSmallest(T arr[], int start, int end)
{
int smallest = 0; //this should only run on the first recursion, not the rest
i = start;
if (i == end-1)
{
return smallest;
}
else
{
if (arr[i] < arr[smallest])
{
smallest = i;
}
return getSmallest<T>(arr, i, end);
}
}
I can't seem to make int smallest = 0; only run on the first recursion and while my program compiles, it is functionally useless.
Any help would be appreciated. Thanks!
Rather than a recursive function of depth O(N) (as in OP's attempt), how about a recursive function of depth O(log(N))?
Divide the array in 2 each recursion.
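template <class T>
int getSmallest(T arr[], int start, int end) {
    if (start + 1 == end) {
        return start;
    }
    int mid = start + (end-start)/2; // mid = (start + end)/2 may overflow
    // The two halves must not overlap: with mid + 1 as the left bound,
    // a two-element range would call itself forever.
    int left = getSmallest(arr, start, mid); // [start, mid)
    int right = getSmallest(arr, mid, end); // [mid, end)
    return arr[left] < arr[right] ? left : right;
}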
There are three conditions:
The array has no elements.
The array has 1 element.
The array has more than 1 element
In the first case we can return -1. In the second case we return the index of the only element. In the third case we compare the current element with the minimum found in the rest of the array, and return the index of the smaller one:
template<class T>
int getSmallest(T arr[], int start, int end) {
if (end - start == 0)
return -1;
if (end - start == 1)
return start;
int idx = getSmallest(arr, start + 1, end);
if (arr[start] < arr[idx])
return start;
return idx;
}
The general way to transform an iterative function is to replace each loop with a helper function. For example:
for (int i = start + 1; i < end; i++) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
}
We first rewrite this into a while loop:
int i = start + 1;
while (i < end) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
i++;
}
We then ask ourselves "what variables defined before the loop do we use in the loop?" Answer: arr, i, end, and smallest. These become the inputs of our function.
We then ask ourselves: what state was modified in the loop that we later use outside the loop? Answer: the value of smallest. This becomes our output.
We thus write the following helper function:
int helper_function(int i, int end, T arr[], int smallest) {
if (i < end) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
i = i + 1;
return helper_function(i, end, arr, smallest);
} else {
return smallest;
}
}
which can be rewritten to
int helper_function(int i, int end, T arr[], int smallest) {
return i < end ? helper_function(i + 1,
end,
arr,
arr[i] < arr[smallest] ? i : smallest)
: smallest;
}
And replace the loop with
smallest = helper_function(i, end, arr, smallest);
So the code at this stage is
int helper_function(int i, int end, T arr[], int smallest) {
return i < end ? helper_function(i + 1,
end,
arr,
arr[i] < arr[smallest] ? i : smallest)
: smallest;
}
int get_smallest(T arr[], int start, int end) {
int smallest = start;
int i = start + 1;
smallest = helper_function(i, end, arr, smallest);
return smallest;
}
And of course we can dramatically simplify get_smallest - we don't need to redefine smallest and then return it, for example. So we get the following code for get_smallest:
int get_smallest(T arr[], int start, int end) {
return helper_function(start + 1, end, arr, start);
}
What do you do if there's more than one level of loop nesting? Get rid of the loops one at a time starting from the outermost loop.
Note that in this special case, it's possible to solve the problem with a different recursive algorithm. That algorithm would look like this:
int get_smallest(T arr[], int start, int end) {
if (start + 1 < end) {
int result = get_smallest(arr, start + 1, end);
return arr[result] < arr[start] ? result : start;
} else {
return start;
}
}
This is possible because of the fact that the min function is associative.
Before C++ had lambda functions (C++11 and newer have them), there was no other way than creating helper functions with an extra argument for the state (e.g. the accumulator, the smallest value, or whatever else is wanted). Other languages, including old and dusty Pascal, have nested functions.
template <class T>
T smallest(const T* first, const T* last) {
// iterative implementation
}
turns, with the help of lambda functions into:
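template <class T>
T smallest(const T* first, const T* last) {
    T result = 0; // dubious in itself, because we do not want to assume too much about what T actually is.
    // A lambda cannot refer to itself inside its own initializer, so the
    // recursion goes through std::function (requires #include <functional>).
    std::function<void(const T*)> loop = [&result, last, &loop] (const T* current) {
        if (current < last) {
            if (*current < result) {
                result = *current;
            }
            loop(current + 1);
        }
    };
    loop(first);
    return result;
}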
Because the implementation is only really and truly generic if we do not assume too much about T, the line T result = 0; is already too much, and we would need some generic 0 for every conceivable type. And even this would be wrong: if the array contains only positive numbers, 0 is smaller than every element and is returned although it is not in the array. For the whole thing to work correctly, the initial value would have to be the largest possible value of T, not 0.
What we also silently assumed is that there is an operator<(T x, T y) defined for T and that it is the one we want (imagine we want to use it for T = std::string - case sensitive comparison? utf8 aware? case insensitive?).
So, even though changing interfaces in code is often not a good idea, sometimes an improvement is simply called for. Because, let's admit it, the design of this function is bad: it tries to sell us more genericity than it can truly grant.
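template <class T>
const T* smallest(const T* first, const T* last) {
    if (first == last)
        return last;
    // std::function (from <functional>) lets the lambda call itself.
    // The result is a const T*, since the input pointers are const.
    std::function<const T*(const T*, const T*, const T*)> loop =
        [&loop] (const T* first, const T* last, const T* smallest) -> const T* {
            if (first != last) {
                if (*first < *smallest) {
                    return loop(first + 1, last, first);
                } else {
                    return loop(first + 1, last, smallest);
                }
            } else {
                return smallest;
            }
        };
    return loop(first + 1, last, first);
}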
This is an improvement, as it only assumes an operator<(...) and does not need to make assumptions about a special initial value of T. It also shows more clearly how the "state" of the recursion is passed along (which makes this implementation tail recursive). And with some luck, the C++ compiler has an optimization for that and avoids unnecessary stack usage.
The next level of improvement would be to add an extra argument, which allows passing a compare function to smallest. And because we then as well could pass a "greater than" to this function, we could try and find a more general name.
template <class T>
T* selectLast( const T* first, const T* last, bool (*predicate)(const T*, const T*)) {
// ...
}
or
template <class T, class Pred>
T* selectLast( const T* first, const T* last, Pred predicate) {
// ...
}
Or we could generalize even more (since we are already in the realm of higher-order functions) and just offer a reduce/fold function. And as such a function already exists, we would use std::reduce() (C++17) from the C++ standard library header file <numeric>.
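As a rough illustration of the predicate-taking variant, here is a minimal sketch. The names selectBy and selectFrom are hypothetical, the result is a const T* because the inputs are const, and the "best so far" pointer is passed along as recursion state.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical helper: carries the best element found so far.
template <class T, class Pred>
const T* selectFrom(const T* current, const T* last, const T* best, Pred pred) {
    if (current == last) return best;
    // current replaces best only when pred(*current, *best) holds,
    // so the first of several equal candidates wins.
    return selectFrom(current + 1, last, pred(*current, *best) ? current : best, pred);
}

// Hypothetical public function: returns last for an empty range.
template <class T, class Pred>
const T* selectBy(const T* first, const T* last, Pred pred) {
    assert(first <= last);
    if (first == last) return last;
    return selectFrom(first + 1, last, first, pred);
}
```

With std::less<int>() as the predicate this finds the first smallest element, like std::min_element from <algorithm>. With std::greater<int>() it finds the first largest one.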

Finding closest triplet in a set of three vectors?

Given three vectors of double, I want to pair every element in each vector such that the difference between the largest and smallest element in each triple is minimized, and every element of every vector is part of a triple. Right now, I'm using std::lower_bound():
double closest(vector<double> const& vec, double value){
    auto const ret = std::lower_bound(vec.begin(), vec.end(), value);
    return(*ret);
}
int main(){
vector<double> a, b, c; vector<vector<double>> triples;
for(auto x : a){
triples.push_back({x, closest(b, x), closest(c, x)});
}
}
Pretend a, b, and c here are populated with some values. The problem is, lower_bound() returns the nearest element not less than the argument. I would also like to consider elements less than the argument. Is there a nice way to do this?
My solution was to implement a binary search that terminates in a comparison of neighboring elements. Another possible solution is to iterate over the elements of each vector/array, adjusting the indices as necessary to minimize the difference (a linear scan, to be weighed against binary search with its ideal complexity of O(log n) per lookup):
int solve(int A[], int B[], int C[], int i, int j, int k)
{
int min_diff, current_diff, max_term;
min_diff = INT_MAX; // from <climits>; Integer.MAX_VALUE is Java, not C++
while (i != -1 && j != -1 && k != -1)
{
current_diff = abs(max(A[i], max(B[j], C[k]))
- min(A[i], min(B[j], C[k])));
if (current_diff < min_diff)
min_diff = current_diff;
max_term = max(A[i], max(B[j], C[k]));
if (A[i] == max_term)
i -= 1;
else if (B[j] == max_term)
j -= 1;
else
k -= 1;
}
return min_diff;
}
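The binary search with a final comparison of neighbors, mentioned above, could be sketched on top of std::lower_bound as follows. This assumes the vector is sorted in ascending order and non-empty. The name closest is reused from the question, and returning the lower neighbor on a tie is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the element of the sorted vector vec that is nearest to value.
double closest(const std::vector<double>& vec, double value) {
    assert(!vec.empty());
    // First element that is not less than value (may be vec.end()).
    auto it = std::lower_bound(vec.begin(), vec.end(), value);
    if (it == vec.begin()) return *it;      // nothing smaller exists
    if (it == vec.end()) return *(it - 1);  // nothing larger or equal exists
    const double above = *it;
    const double below = *(it - 1);
    // Compare both neighbors; ties go to the smaller element.
    return (value - below) <= (above - value) ? below : above;
}
```

For a vector holding 1, 4 and 10, the call closest(vec, 2.0) yields 1 and closest(vec, 3.0) yields 4.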

Quicksort c++ first element as pivot

I have something like this and I want to use the first element as the pivot.
Why is this program still not working?
void algSzyb1(int tab[],int l,int p)
{
int x,w,i,j;
i=l; //l is left and p is pivot, //i, j = counter
j=p;
x=tab[l];
do
{
while(tab[i]<x) i++;
while(tab[j]>x) j--;
if(i<=j)
{
w=tab[i];
tab[i]=tab[j];
tab[j]=w;
i++;
j--;
}
}
while(!(i<j));
if(l<j) algSzyb1(tab,l,j);
if(i<p) algSzyb1(tab,i,p);
}
Looking at the code, not really checking what it does, just looking at the individual lines, this one line stands out:
while(!(i<j));
I look at that line, and I think: There is a bug somewhere round here. I haven't actually looked at the code so I don't know what the bug is, but I look at this single line and it looks wrong.
I think you need to decrement j before incrementing i.
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
Also I have added an extra condition to ensure that i doesn't sweep past j. (Uninitialized memory read).
The name pivot is slightly misleading, as it suggests that the element ends up in its sorted position. Both this code and the Wikipedia page on quicksort move the pivot into the higher partition and don't guarantee that the item ends up in its correct place.
The end condition is when you have swept through the list
while( i < j ); /* not !(i<j) */
At the end of the search, you need to recurse on a smaller range. The code you had caused a stack overflow, because it repeatedly tried the same range.
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
Full code
void algSzyb1(int tab[], int l, int p)
{
int x, w, i, j;
i = l;
j = p;
x = tab[l]; // come back here later :D
do
{
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
if (i < j)
{
w = tab[i];
tab[i] = tab[j];
tab[j] = w;
i++;
j--;
}
} while ((i<j));
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
}
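For comparison, here is a minimal sketch of the textbook Hoare partition scheme with the first element as the pivot. The function name quicksortFirstPivot is made up for this example; l and p are the inclusive left and right bounds, as in the question. An input such as {3, 9, 1, 5}, where the two indices meet directly after a swap, is a useful test case for any variant, because the element at the meeting point is easy to leave unexamined.

```cpp
#include <cassert>
#include <utility>

// Sorts tab[l..p] (both bounds inclusive) in ascending order.
void quicksortFirstPivot(int tab[], int l, int p) {
    assert(tab != nullptr);
    if (l >= p) return;          // zero or one element: already sorted
    const int x = tab[l];        // first element is the pivot value
    int i = l - 1;
    int j = p + 1;
    for (;;) {
        do { ++i; } while (tab[i] < x);   // first element >= pivot from the left
        do { --j; } while (tab[j] > x);   // first element <= pivot from the right
        if (i >= j) break;                // indices met: partition is done
        std::swap(tab[i], tab[j]);
    }
    // Everything in [l, j] is <= x and everything in [j+1, p] is >= x.
    quicksortFirstPivot(tab, l, j);
    quicksortFirstPivot(tab, j + 1, p);
}
```

Because the pivot is the first element, j always stops before p, so both recursive calls work on strictly smaller ranges.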

C++ vectors Quicksort - Seems to work differently with different pivots

My quicksort algorithm with C++ vectors seems to work fine when I choose the first, last, or middle element as the pivot, but not for some other choices.
I am not sure of all of them, but for example, if I set the pivot as (r-l)/2 it would not sort correctly.
I believe my code is correct, but I am not sure; there might be critical errors.
Is it even possible to sometimes work and sometimes not work, depending on the pivot?
I thought it only affected the running time, so I guess something is wrong with my code.
The following is my code:
#include <vector>
#include <algorithm>
#include <fstream> // needed for ifstream in main
using namespace std;
int choosePivot(int l, int r) {
return (r-l)/2; // or Even r/2
}
int partition(vector<int>& vec, int l, int r) {
int pi = choosePivot(l, r); // pivot index
int pivot = vec[pi];
// swap pivot with the beginning
swap(vec[pi], vec[l]);
// beginning index of the right side of the pivot (larger than the pivot)
int i = l + 1;
// partition around the pivot
for (int j = l+1; j <= r; ++j) {
if (vec[j] <= pivot) {
swap(vec[i], vec[j]);
++i;
}
}
// swap pivot back to its position
swap(vec[l], vec[i - 1]);
// return pivot position
return i - 1;
}
void quicksort(vector<int>& vec, int l, int r) {
if (l < r) {
int p = partition(vec, l, r);
quicksort(vec, l, p - 1);
quicksort(vec, p + 1, r);
}
}
int main() {
ifstream infile("IntegerArray.txt");
int a;
vector<int> vec;
vec.reserve(100000);
while (infile >> a)
vec.push_back(a);
quicksort(vec, 0, vec.size() - 1);
return 0;
}
I added a main function that tests the example.
This is the IntegerArray.txt
It's a file that contains all integers from 1 to 100,000 (no duplicates).
I edited the choosePivot function so that it will output a wrongly sorted array.
I don't have a print because the size is too big.
The way quicksort is implemented in the code above, it breaks when the pivot index is not between l and r.
In such case, it starts by bringing in a value from outside the [l, r] segment with swap(vec[pi], vec[l]);.
This can break an already-sorted part of the array.
Now, (r-l)/2 is not always between l and r.
When, for example, l = 10 and r = 20, the pivot index is (20-10)/2 = 5.
So, the code will start sorting the [10, 20] segment by swapping vec[5] and vec[10].
If the part with vec[5] was sorted before [10, 20] segment, this will most likely result in the array not being sorted in the end.
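A pivot index that always stays inside the segment can be obtained by offsetting the half-length by l. This is a minimal sketch of such a choosePivot, assuming l <= r as guaranteed by the l < r check in quicksort:

```cpp
#include <cassert>

// Returns the middle index of the inclusive segment [l, r].
int choosePivot(int l, int r) {
    assert(l <= r);
    // (r - l) / 2 is only the distance from l, so l has to be added back.
    return l + (r - l) / 2;
}
```

For l = 10 and r = 20 this gives 15 instead of 5, so the initial swap with vec[l] never touches elements outside the segment.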

Blueberries (SPOJ) - Dynamic Programming Time Limit Exceeded

I was solving a problem on spoj. Problem has a simple recursive solution.
Problem: Given an array of numbers of size n, select a set of numbers such that no two elements in the set are consecutive and sum of subset elements will be as close as possible to k, but should not exceed it.
My Recursive Approach
I used an approach similar to knapsack, dividing the problem into two branches: one includes the current element and the other ignores it.
function solve_recursively(n, current, k)
    if n < 0
        return current
    if n == 0
        if current + a[n] <= k
            return current + a[n]
        else
            return current
    if current + a[n] > k
        return solve_recursively(n-1, current, k)
    else
        return max(solve_recursively(n-1, current, k), solve_recursively(n-2, current+a[n], k))
Later as it is exponential in nature, I used map (in C++) to do memoization to reduce complexity.
My source code:
struct k{
int n;
int curr;
};
bool operator < (const struct k& lhs, const struct k& rhs){
if(lhs.n != rhs.n)
return lhs.n < rhs.n;
return lhs.curr < rhs.curr;
};
int a[1001];
map<struct k,int> dp;
int recurse(int n, int k, int curr){
if(n < 0)
return curr;
struct k key = {n, curr};
if(n == 0)
return curr + a[0] <= k ? curr + a[0] : curr;
else if(dp.count(key))
return dp[key];
else if(curr + a[n] > k){
dp[key] = recurse(n-1, k, curr);
return dp[key];
}
else{
dp[key] = max(recurse(n-1, k, curr), recurse(n-2, k, curr+a[n]));
return dp[key];
}
}
int main(){
    int t,n,k;
    scanint(t);
    // j is the scenario number printed below (assumed to start at 1)
    for(int j = 1; j <= t; ++j){
        scanint(n);
        scanint(k);
        for(int i = 0; i<n; ++i)
            scanint(a[i]);
        dp.clear();
        printf("Scenario #%d: %d\n",j, recurse(n-1, k, 0));
    }
    return 0;
}
I checked for given test cases. It cleared them. But I am getting wrong answer on submission.
EDIT: Earlier my output format was wrong, so I was getting Wrong Answer. But now it's showing Time Limit Exceeded. I think a bottom-up approach would be helpful, but I am having trouble formulating one. I am approaching it as a bottom-up knapsack, but I am having some difficulty with the exact formulation.
To my understanding, you almost have the solution. If the recurrence relation is correct but too inefficient, you just have to change the recursion to iteration. Apparently, you already have dp, which represents the states and their respective values. Basically, you should be able to fill dp with three nested loops over n, k and curr, each running in increasing order so that every value of dp that is needed has already been calculated. Then you replace the recursive calls to recurse with accesses to dp.
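One possible bottom-up formulation is sketched below. It is not necessarily the exact table the answer has in mind: it uses a reachability table over (number of elements considered, sum) instead of the map from the question, and it assumes all values are non-negative. The name bestSum is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Largest sum <= k of elements of a such that no two chosen elements are adjacent.
int bestSum(const std::vector<int>& a, int k) {
    assert(k >= 0);
    const int n = static_cast<int>(a.size());
    // can[i][s] is true when some valid selection among the first i elements
    // sums to exactly s.
    std::vector<std::vector<char>> can(n + 1, std::vector<char>(k + 1, 0));
    can[0][0] = 1;  // the empty selection has sum 0
    for (int i = 1; i <= n; ++i) {
        const int value = a[i - 1];
        for (int s = 0; s <= k; ++s) {
            const bool skip = can[i - 1][s];  // element i is not taken
            bool take = false;                // element i is taken, so i-1 is not
            if (s >= value) {
                take = (i >= 2) ? (can[i - 2][s - value] != 0) : (s == value);
            }
            can[i][s] = (skip || take) ? 1 : 0;
        }
    }
    for (int s = k; s >= 0; --s) {
        if (can[n][s]) return s;  // largest reachable sum not exceeding k
    }
    return 0;
}
```

The table has (n + 1) * (k + 1) entries and each one is filled in constant time, so the running time is proportional to n * k with no map lookups.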