Quicksort c++ first element as pivot - c++

I have something like this and I want to use the first element as the pivot.
Why does this program still not work?
void algSzyb1(int tab[],int l,int p)
{
int x,w,i,j;
i=l; // l is the left bound, p is the right bound; i and j are the scan indices
j=p;
x=tab[l];
do
{
while(tab[i]<x) i++;
while(tab[j]>x) j--;
if(i<=j)
{
w=tab[i];
tab[i]=tab[j];
tab[j]=w;
i++;
j--;
}
}
while(!(i<j));
if(l<j) algSzyb1(tab,l,j);
if(i<p) algSzyb1(tab,i,p);
}

Looking at the code, not really checking what it does, just looking at the individual lines, this one line stands out:
while(!(i<j));
I look at that line, and I think: There is a bug somewhere round here. I haven't actually looked at the code so I don't know what the bug is, but I look at this single line and it looks wrong.
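For what it is worth, here is a minimal sketch of what changing only that line might look like. It assumes the rest of the asker's loop is kept as posted and that the intended condition is i <= j (the classic form of this partition loop). The name quickSortFirstPivot is made up for illustration.

```cpp
#include <utility>

// Sketch: the asker's routine with only the loop condition changed
// from !(i < j) to (i <= j). The pivot is still the first element.
void quickSortFirstPivot(int tab[], int l, int p) {
    int i = l;              // scans upward from the left bound
    int j = p;              // scans downward from the right bound
    const int x = tab[l];   // pivot value: first element of the range
    do {
        while (tab[i] < x) i++;
        while (tab[j] > x) j--;
        if (i <= j) {
            std::swap(tab[i], tab[j]);
            i++;
            j--;
        }
    } while (i <= j);       // keep going until the indices have crossed
    if (l < j) quickSortFirstPivot(tab, l, j);
    if (i < p) quickSortFirstPivot(tab, i, p);
}
```

Called as quickSortFirstPivot(tab, 0, n - 1) for an array of n elements, with n at least 1.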

I think you need to decrement j before incrementing i.
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
I have also added an extra condition so that i cannot sweep past j (which would read outside the array).
The name "pivot" is slightly misleading here. One might expect the pivot to end up in its final sorted position, but this scheme (like the one on the Wikipedia quicksort page) only splits the range around the pivot value: the pivot element itself may be moved into the upper partition and is not guaranteed to land in its final place.
The loop should end once i and j have met, that is, once the whole range has been swept:
while( i < j ); /* not !(i<j) */
After partitioning, each recursive call must work on a strictly smaller range. The code you had caused a stack overflow because it kept recursing on the same range.
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
Full code
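void algSzyb1(int tab[], int l, int p)
{
    int x, w, i, j;
    i = l;
    j = p;
    x = tab[l]; // come back here later :D
    do
    {
        while (tab[j] > x) j--;          // scan down first
        while (tab[i] < x && i < j) i++; // scan up, but never past j
        if (i < j)
        {
            w = tab[i];
            tab[i] = tab[j];
            tab[j] = w;
            i++;
            j--;
        }
    } while (i < j);
    // If the last swap left i == j, tab[j] has not been compared with x yet.
    // When it is larger than x it belongs to the upper partition; without this
    // check an input such as {2, 3, 1} ends up as {1, 3, 2}.
    if (tab[j] > x) j--;
    if (l < j) algSzyb1(tab, l, j);
    if (j + 1 < p) algSzyb1(tab, j + 1, p);
}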

Related

C++ vectors Quicksort - Seems to work differently with different pivots

My quicksort algorithm with C++ vectors seems to work fine when I choose the first, last, or middle element as the pivot, but not with some other choices.
I am not sure of all of them, but for example, if I set the pivot as (r-l)/2 it would not sort correctly.
I believe my code is correct, but I am not sure; there might be critical errors.
Is it even possible to sometimes work and sometimes not work, depending on the pivot?
I thought it only affected the running time, so I guess something is wrong with my code.
The following is my code:
#include <fstream> // needed for ifstream in main
#include <vector>
#include <algorithm>
using namespace std;
int choosePivot(int l, int r) {
return (r-l)/2; // or Even r/2
}
int partition(vector<int>& vec, int l, int r) {
int pi = choosePivot(l, r); // pivot index
int pivot = vec[pi];
// swap pivot with the beginning
swap(vec[pi], vec[l]);
// beginning index of the right side of the pivot (larger than the pivot)
int i = l + 1;
// partition around the pivot
for (int j = l+1; j <= r; ++j) {
if (vec[j] <= pivot) {
swap(vec[i], vec[j]);
++i;
}
}
// swap pivot back to its position
swap(vec[l], vec[i - 1]);
// return pivot position
return i - 1;
}
void quicksort(vector<int>& vec, int l, int r) {
if (l < r) {
int p = partition(vec, l, r);
quicksort(vec, l, p - 1);
quicksort(vec, p + 1, r);
}
}
int main() {
ifstream infile("IntegerArray.txt");
int a;
vector<int> vec;
vec.reserve(100000);
while (infile >> a)
vec.push_back(a);
quicksort(vec, 0, vec.size() - 1);
return 0;
}
I added a main function that tests the example.
This is IntegerArray.txt.
It's a file that contains all integers from 1 to 100,000 (no duplicates).
I edited the choosePivot function so that it produces a wrongly sorted array.
I don't print the result because it is too big.
The way quicksort is implemented in the code above, it breaks when the pivot index is not between l and r.
In that case, it starts by bringing in a value from outside the [l, r] segment with swap(vec[pi], vec[l]);.
This can break an already-sorted part of the array.
Now, (r-l)/2 is not always between l and r.
When, for example, l = 10 and r = 20, the pivot index is (20-10)/2 = 5.
So, the code will start sorting the [10, 20] segment by swapping vec[5] and vec[10].
If the part with vec[5] was sorted before [10, 20] segment, this will most likely result in the array not being sorted in the end.

Counting the Number of Element Comparisons in Quick Sort

I have been provided this predefined code for Quick Sort which isn't to be altered much:
I know we already have questions on this but this is different as the logic is predefined here.
void quicksort(int a[], int l, int r)
{
if (r <= l) return;
/* call for partition function that you modify */
quicksort(a, l, i-1);
quicksort(a, i+1, r);
}
int partition(int a[], int l, int r)
{ int i = l-1, j = r; int v = a[r];
for (;;)
{
while (a[++i] < v) ;
while (v < a[--j]) if (j == l) break;
if (i >= j) break;
exch(a[i], a[j]);
}
exch(a[i], a[r]);
return i;
}
We are only required to make a slight modification so that quicksort returns the total number of comparisons that it and the partition function together perform while sorting the given array a. Only comparisons that involve array elements are counted. You are not allowed to use any global variable to count them.
I have implemented this as follows, kindly let me know if I'm mistaken somewhere:
int partition(int a[], int l, int r, int& count) {
int i = l - 1, j = r; int v = a[r];
for (;;) {
while (a[++i] < v) count++;
while (v < a[--j]) {
count++;
if (j == l) break;
}
if (i >= j) break;
swap(a[i], a[j]);
}
swap(a[i], a[r]);
return i;
}
int quickSort(int a[], int l, int r) {
int count = 0;
if (r <= l) return 0;
int i = partition(a, l, r, count);
return count + quickSort(a, l, i - 1) + quickSort(a, i + 1, r);
}
Once you confirm, I'm going to share a surprising result of my research on this with you.
rici's comment works fine as a solution. There is an alternative approach that can be generalized to std::sort and other algorithms: make a counting comparer.
Something along the lines of
struct CountingComparer{
    CountingComparer():count(0){}
    CountingComparer(const CountingComparer& cc):count(cc.count){}
    bool operator()(int lhs, int rhs){
        ++count; // count every comparison, whatever its result
        return lhs < rhs;
    }
    size_t count;
};
Now you need to change the signature of your function to add the comparer as a last argument. Like
template<typename COMP>
void quicksort( .... , COMP comp){
And the same change to the partition function. The comparisons are then made by
while (comp(a[++i],v)) and
while (comp(v, a[--j])) respectively.
Calling the sort
You need to make sure you have a reference to your comparer in the template argument.
CountingComparer comp;
quicksort(... , std::ref(comp));
Makes sure comp is not copied.
After the sort you find the number of comparisons in comp.count.
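Putting the pieces above together, a self-contained sketch might look like the following. It assumes the array is held in a std::vector<int>, and the names partitionCounted and quicksortCounted are invented here; the partition logic is the predefined one from the question with each element comparison routed through the comparer.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct CountingComparer {
    std::size_t count = 0;
    bool operator()(int lhs, int rhs) {
        ++count;                 // every call is one element comparison
        return lhs < rhs;
    }
};

template <typename Comp>
int partitionCounted(std::vector<int>& a, int l, int r, Comp comp) {
    int i = l - 1, j = r;
    int v = a[r];                // pivot: rightmost element, acts as a guard
    for (;;) {
        while (comp(a[++i], v)) {}
        while (comp(v, a[--j])) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

template <typename Comp>
void quicksortCounted(std::vector<int>& a, int l, int r, Comp comp) {
    if (r <= l) return;
    int i = partitionCounted(a, l, r, comp);
    quicksortCounted(a, l, i - 1, comp);
    quicksortCounted(a, i + 1, r, comp);
}
```

A call such as quicksortCounted(data, 0, n - 1, std::ref(comp)) leaves the total in comp.count. Passing comp without std::ref would count into copies, and the caller's comp.count would stay at zero.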
Regarding your comment on counts
The behaviour of your quicksort is discussed extensively on the Wikipedia page. An already-sorted array is expected to behave badly, while randomly ordered elements behave well.
Regarding the cleverness in the partition function
for (;;)
{
while (a[++i] < v) ;
while (v < a[--j]) if (j == l) break;
if (i >= j) break;
exch(a[i], a[j]);
}
exch(a[i], a[r]);
The outer for (;;) statement isn't really counting anything, so it is just a while(true) in disguise. It ends through a break statement.
Find the first large element to swap: The while (a[++i] < v); statement takes advantage of the fact that the pivot element (v, that is, a[r]) is the rightmost element. So the pivot element acts as a guard here.
Find the first small element to swap: The while (v < a[--j]) if (j == l) break; statement does not have the pivot as a guard. Instead it checks for the leftmost limit.
The final check just tests whether the partition is done. If so, it breaks out of the infinite loop, and finally
swap(a[i], a[r]); moves the pivot element to its correct position.

In place randomized selection algorithm

We are currently studying algorithms, hence I marked this question as "homework" even though it is not a homework-related task. Just to be safe.
We just studied the randomized selection algorithm, and the logic seems simple. Choose an element from a list, and then put that element in its right place. Then repeat the process in one sublist until the element at the index is in its place, where index is the position of the element you want in the sorted list.
This should be a modified version of the quicksort algorithm, except that we only sort one sublist, not both. Hence a performance boost (in big-O).
I can successfully implement this algorithm using external storage (C++, and zero-based arrays):
int r_select2(vector<int>& list, int i)
{
int p = list[0];
vector<int> left, right;
for (int k = 1; k < list.size(); ++k)
{
if (list[k] < p) left.push_back(list[k]);
else right.push_back(list[k]);
}
int j = left.size();
if (j > i) p = r_select2(left, i);
else if (j < i) p = r_select2(right, i - j - 1);
return p;
}
However, I want to implement the algorithm using in-situ (in-place), and not use extra sub arrays. I believe that this should be an easy/trivial task. But somewhere, my in-situ version goes wrong. Maybe it’s just late and I need to sleep, but I can’t see the root cause of why the following version fails:
int r_select(vector<int>& list, int begin, int end, int i)
{
i = i + begin;
int p = list[begin];
if (begin < end)
{
int j = begin;
for (int k = begin + 1; k < end; ++k)
{
if (list[k] < p)
{
++j;
swap(list[j], list[k]);
}
}
swap(list[begin], list[j]);
if (j > i) p = r_select(list, begin, j, i);
else if (j < i) p = r_select(list, j + 1, end, i - j);
}
return p;
}
In both examples, the first element is used as the pivot to keep the design simple. In both examples, i is the index of the element I want.
Any ideas where the second example is failing? Is it a simple off-by-one error?
Thank you all!
This sounds fishy:
i = i + begin;
...
r_select(list, begin, j, i);
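To spell out why that looks fishy: r_select adds begin to i on entry, yet the recursive calls pass the already shifted i (or i - j), so the offset gets applied again at every level. One way around it, sketched below under the assumption that k is always an absolute index into the vector and [begin, end) is a half-open range, is to never shift the index at all. The name selectInPlace is made up for illustration, and the pivot is still simply the first element, as in the question.

```cpp
#include <utility>
#include <vector>

// Returns the element that would sit at absolute index k if list were sorted.
// Precondition: begin <= k < end, and [begin, end) is a valid range of list.
int selectInPlace(std::vector<int>& list, int begin, int end, int k) {
    int pivot = list[begin];
    int j = begin;                       // last index holding a value < pivot
    for (int m = begin + 1; m < end; ++m) {
        if (list[m] < pivot) {
            ++j;
            std::swap(list[j], list[m]);
        }
    }
    std::swap(list[begin], list[j]);     // pivot is now at its sorted position j
    if (k < j) return selectInPlace(list, begin, j, k);    // look left of pivot
    if (k > j) return selectInPlace(list, j + 1, end, k);  // look right of pivot
    return list[j];
}
```

The initial call would be selectInPlace(list, 0, list.size(), i) with i as the zero-based rank wanted.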

What is wrong with my C++ merge sort program?

I'm stuck at an impasse with this implementation. My n2 variable is being overwritten during the merging of the subarrays. What could be causing this? I have tried hard-coding values in, but it does not seem to work.
#include <iostream>
#include <cstdlib>
#include <ctime> // For time(), time(0) returns the integer number of seconds from the system clock
#include <iomanip>
#include <algorithm>
#include <cmath>//added last nite 3/18/12 1:14am
using namespace std;
int size = 0;
void Merge(int A[], int p, int q, int r)
{
int i,
j,
k,
n1 = q - p + 1,
n2 = r - q;
int L[5], R[5];
for(i = 0; i < n1; i++)
L[i] = A[i];
for(j = 0; j < n2; j++)
R[j] = A[q + j + 1];
for(k = 0, i = 0, j = 0; i < n1 && j < n2; k++)//for(k = p,i = j = 1; k <= r; k++)
{
if(L[i] <= R[j])//if(L[i] <= R[j])
{
A[k] = L[i++];
} else {
A[k] = R[j++];
}
}
}
void Merge_Sort(int A[], int p, int r)
{
if(p < r)
{
int q = 0;
q = (p + r) / 2;
Merge_Sort(A, p, q);
Merge_Sort(A, q+1, r);
Merge(A, p, q, r);
}
}
void main()
{
int p = 1,
A[8];
for (int i = 0;i < 8;i++) {
A[i] = rand();
}
for(int l = 0;l < 8;l++)
{
cout<<A[l]<<" \n";
}
cout<<"Enter the amount you wish to absorb from host array\n\n";
cin>>size;
cout<<"\n";
int r = size; //new addition
Merge_Sort(A, p, size - 1);
for(int kl = 0;kl < size;kl++)
{
cout<<A[kl]<<" \n";
}
}
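What tools are you using to compile the program? Some compilers have flags that switch on checks for this sort of thing, e.g. in gcc (-fmudflap; I haven't used it, but it looks potentially useful). Note that -fmudflap was removed in GCC 4.9; -fsanitize=address is the usual replacement in current GCC and Clang.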
If you can use a debugger (e.g. gdb) you should be able to add a 'data watch' for the variable n2, and the debugger will stop the program whenever it detects anything writing into n2. That should help you track down the bug. Or try valgrind.
A simple technique to temporarily stop this type of bug is to put some dummy variables around the one getting trashed, so:
int dummy1[100];
int n2 = r - q;
int dummy2[100];
int L[5], R[5];
Variables being trashed are usually caused by code writing beyond the bounds of arrays.
The culprit is likely R[5] because that is likely the closest. You can look in the dummies to see what is being written, and may be able to deduce from that what is happening.
Another option is to make all arrays huge while you track down the problem. Again, set the values beyond the correct bounds to a known value, and check those values that should be unchanged.
You could make a little macro to do those checks, and drop it in at any convenient place.
I used a similar Merge function earlier and it did not work properly. I then redesigned it, and now it works fine. Below is the redesigned merge function in C++.
void merge(int a[], int p, int q, int r){
    int n1 = q-p+1; // number of elements in the first half
    int n2 = r-q;   // number of elements in the second half
    int i, j, k;
    int * b = new int[n1+n2]; // temporary array to store merged elements
    i = p;
    j = q+1;
    k = 0;
    while(i < (p+n1) && j < (q+1+n2)){ // merge the two sorted halves into b
        if(a[i] <= a[j]){
            b[k++] = a[i++];
        }
        else
            b[k++] = a[j++];
    }
    if(i >= (p+n1)){          // first half is exhausted:
        while(k < (n1+n2))    // append the rest of the second half
            b[k++] = a[j++];
    }
    else{                     // second half is exhausted:
        while(k < (n1+n2))    // append the rest of the first half
            b[k++] = a[i++];
    }
    for(i = p, j = 0; i <= r;){ // copy the merged result back into a[p..r]
        a[i++] = b[j++];
    }
    delete [] b;
}
I hope it helps.
void Merge(int A[], int p, int q, int r)
{
int i,
j,
k,
n1 = q - p + 1,
n2 = r - q;
int L[5], R[5];
for(i = 0; i < n1; i++)
L[i] = A[i];
You only allocate L[5], but the n1 bound you're using is based on inputs q and p -- and the caller is allowed to call the function with values of q and p that allow writing outside the bounds of L[]. This can manifest itself as over-writing any other automatic variables, but because it is undefined behavior, just about anything could happen. (Including security vulnerabilities.)
I do not know what the best approach to fix this is -- I don't understand why you've got fixed-length buffers in Merge(), I haven't read closely enough to discover why -- but you should not access L[i] when i is greater than or equal to 5.
This entire discussion also holds for R[]. And, since A is passed to Merge(), it would make sense to ensure that your array accesses for it are also always in bounds. (I haven't spotted them going out of bounds, but since this code needs re-working anyway, I'm not sure it's worth my looking for them carefully.)

Using quicksort to sort strings

I am having issues implementing a quicksort to sort an array of strings. I am also relatively new to c++ so still struggling with some issues there. Right now my code correctly reads in and creates an array of strings but I run into problems when I try to use my quicksort algorithm. The first problem I am running into is that I believe that the recursion is not stopping when it should. I get a segmentation fault after the quicksort runs for a little bit.
Code (Modified):
#include "MyParser.h"
#include <iostream>
#include <fstream>
#include <string>
void resize(string*& words, int size)
{
string* newArray = new string[size*2];
for (int i = 0; (i < size)&&(i<size*2); i++)
newArray[i] = words[i];
for (int i = size; i < size*2; i++)
newArray[i] = "";
delete[] words;
words = newArray;
}
void partitionArray(string*& words, int& left, int& right, int pi)
{
int i = left;
int j = right;
string tmp;
string pivot = words[pi];
while (i < j) {
string str1 = words[i];
string str2 = words[j];
while ((str1.compare(pivot) < 0) && (i < right))
i++;
while ((str2.compare(pivot) >= 0) && (j > left))
j--;
if (i <= j)
{
tmp = words[i];
words[i] = words[j];
words[j] = tmp;
i++;
j--;
}
};
}
void quickSort(string*& words, int left, int right)
{
int i = left;
int j = right;
string tmp;
string pivot = words[(left + right) / 2];
/* partition */
int pivotIndex = (left + right) / 2;
pivotIndex = partitionArray(words, 0, right, pivotIndex);
cout << "start recursion" << endl;
/* recursion */
if (left < j)
quickSort(words, left, j);
if (i < right)
quickSort(words, i, right);
}
int main()
{
// define file reader
ofstream outData;
outData.open("logData.txt");
Parser* myParser = new Parser("testData.txt");
int sizeOfArray = 500;
string* words = new string[sizeOfArray];
int index = 0;
while(myParser->hasTokens())
{
if (index >= sizeOfArray)
{
resize(words, sizeOfArray);
sizeOfArray = sizeOfArray*2;
}
string currentWord = myParser->nextToken();
if (currentWord != "")
{
words[index] = currentWord;
index++;
}
}
int lastWordInArrayIndex = index;
quickSort(words, 0, lastWordInArrayIndex);
return 0;
}
Any help on this would be greatly appreciated!
MODIFIED
Right now it sorts the following 11 elements correctly:
adfgh
btyui
dfghj
eerty
fqwre
kyuio
verty
wwert
yrtyu
zbsdf
zsdfg
but when attempting to sort the following parsed text, which is free of all delimiters except inside words like "like-this" with a single hyphen or words with an apostrophe like "they're", it does not terminate:
Three days after the quarrel, Prince Stepan Arkadyevitch
Oblonsky--Stiva, as he was called in the fashionable world--
woke up at his usual hour, that is, at eight o'clock in the
morning, not in his wife's bedroom, but on the leather-covered
sofa in his study. He turned over his stout, well-cared-for
person on the springy sofa, as though he would sink into a long
sleep again; he vigorously embraced the pillow on the other side
and buried his face in it; but all at once he jumped up, sat up
on the sofa, and opened his eyes.
"Yes, yes, how was it now?" he thought, going over his dream.
"Now, how was it? To be sure! Alabin was giving a dinner at
Darmstadt; no, not Darmstadt, but something American. Yes, but
then, Darmstadt was in America. Yes, Alabin was giving a dinner
on glass tables, and the tables sang, Il mio tesoro--not Il mio
tesoro though, but something better, and there were some sort of
little decanters on the table, and they were women, too," he
remembered.
Again any help with this issue would be greatly appreciated!
Your quickSort function will indeed recurse indefinitely:
void quickSort(string*& words, int left, int right)
{
int i = left;
int j = right;
...
if (left < j)
quickSort(words, left, j);
if (i < right)
quickSort(words, i, right);
}
i, j, left and right are not modified anywhere in that function, so if left < right the function will be called recursively with the same parameters again and again.