Quicksort Partition Algorithm - C++

I am trying to learn Quick Sort. To do so, I followed the logic for quick sort in this article where the last element is picked as the pivot and you work through the array from both ends swapping elements as needed. Now after a long time of trying to come up with my own algorithm based on this, here is what I have so far:
#include <iostream>
using namespace std;

int a[] = {10, 4, 3, 2, 8, 5};
int j;

int partition(int left, int right, int array[]) {
    int pivot = array[right];
    while (1) {
        while (array[left] < pivot) {
            left = left + 1;
        }
        while (array[right] > pivot) {
            right = right - 1;
        }
        if (left >= right) {
            return right;
        }
        int temp1 = array[left];
        int temp2 = array[right];
        array[left] = temp2;
        array[right] = temp1;
    }
}

void quicksort(int left, int right, int array[]) {
    if (left < right) {
        int p = partition(left, right, array);
        quicksort(left, p - 1, array);
        quicksort(p + 1, right, array);
    }
}

int main() {
    quicksort(0, sizeof(a) / sizeof(a[0]) - 1, a);
    for (int i = 0; i < (sizeof(a) / sizeof(a[0])); i++) {
        cout << a[i] << endl;
    }
}
So after testing this with various arrays of different sizes and elements, it does output a correctly sorted array. However, in the algorithms I have found online (e.g. Hoare's), they always decrement right and increment left at the start of the partition loop. Also, the pivot does not move in most algorithms I have found, but I move it.
What I am wondering is: am I doing this wrong? Why does it work? It has been a lot of trial and error, so it would be understandable if I did it wrong. I am also fairly new to algorithms and data structures, so I wouldn't be surprised if I made some mistakes.

Regarding the pivot for quicksort, there's no fixed rule for selecting it. Choosing the pivot really depends on which strategy you want to impose on your input data, and that largely depends on the properties of the input - its range, size, etc.
Generally, we choose the pivot so that it splits the array into two roughly equal halves. This is done so that we don't end up pushing all the elements onto one side of the partition. If that happens, your algorithm degrades to O(n^2) - think of skewed trees if you're not able to see why.
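For example, a rough sketch of one common strategy, median-of-three, which tends to avoid those one-sided splits. The helper name here is just illustrative, not part of your code:

#include <utility>  // std::swap

// Rough sketch of a median-of-three pivot choice (name is made up).
// It orders array[left], array[mid], array[right], then moves the median to
// the end so a "last element as pivot" partition like yours can use it.
int medianOfThreePivot(int array[], int left, int right) {
    int mid = left + (right - left) / 2;
    if (array[mid] < array[left])   std::swap(array[mid], array[left]);
    if (array[right] < array[left]) std::swap(array[right], array[left]);
    if (array[right] < array[mid])  std::swap(array[right], array[mid]);
    std::swap(array[mid], array[right]);  // the median is now in the pivot slot
    return array[right];
}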

Related

I am trying to implement the double pivot partition function in reference to quicksort. Can someone help me?

Hi, I'm trying to implement this double-pivot partition function and then pass the found indices to the recursive quicksort function with three calls: (arr, left, pivot_sx), (arr, pivot_sx + 1, pivot_dx - 1), and (arr, pivot_dx + 1, right). I know that the function is still incomplete and that I do not return the indexes yet; most likely I will pass a pointer to receive the left index, and simply return the right index with a return statement.
The code I have written so far is this:
#include <iostream>
#include <cstdlib>
#include <time.h>
#include <utility>
using namespace std;

int double_partition(int a[], int left, int right) {
    if (a[left] > a[right]) { std::swap(a[left], a[right]); }
    int piv_sx = a[left];
    int piv_dx = a[right];
    int i_piv_sx = left;
    int i_piv_dx = right;
    int i = left + 1;
    int j = right - 1;
    while (i <= j) {
        if (a[i] <= piv_sx) {
            std::swap(a[i], a[i_piv_sx]);
            i_piv_sx += 1;
        } else if (a[i] >= piv_dx) {
            std::swap(a[i], a[i_piv_dx]);
            i_piv_dx -= 1;
        }
        i++;
    }
    return 0;
}

void printArray(int arr[], int size) {
    int i;
    for (i = 0; i < size; i++)
        cout << " " << arr[i];
    cout << std::endl;
}

int main()
{
    srand(time(NULL));
    int n;
    cout << "Enter the number of items:" << "\n";
    cin >> n;
    int *arr = new int(n);
    for (int x = 0; x < n; x++) {
        int num = rand() % 10000;
        arr[x] = num;
    }
    printArray(arr, n);
    double_partition(arr, 0, n - 1);
    printArray(arr, n);
}
As we know, the pivot on the left must be smaller than the one on the right, so at the beginning I compare the two pivots and, if the one on the left is greater, I swap them.
After doing this I save the values of the pivots in two variables, which I will then need for the various comparisons.
Then I run the while loop, starting i from left + 1 because I don't need to compare the first element: I already know its value, since it is the left pivot, so it would be a useless comparison. The same reasoning applies to j = right - 1.
I do all the comparisons, and in the end, when I compile and run the code, sometimes I get a correct result and sometimes an incorrect one.
Is there anything I'm not doing right? Could anyone help me?
I should end up with the array in a state where the elements on the left are smaller than the left pivot, the elements in the middle are greater than the left pivot but smaller than the right pivot, and the elements on the right are greater than the right pivot.
int *arr = new int(n); gives you 1 int, initialized with the value n.
You want int *arr = new int[n];, which gives you an array of n ints.
Also don't forget to delete the array with delete[] arr; when you do not need it anymore.
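For example, a quick sketch of the corrected allocation (or, simpler still, a std::vector so you don't have to manage the memory yourself):

#include <cstdlib>
#include <vector>

int main() {
    int n = 10;

    int *arr = new int[n];          // an array of n ints (not one int of value n)
    for (int x = 0; x < n; x++)
        arr[x] = rand() % 10000;
    delete[] arr;                   // delete[] matches new[]

    std::vector<int> v(n);          // alternative: the vector cleans up automatically
    for (int x = 0; x < n; x++)
        v[x] = rand() % 10000;
}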

What is wrong with this merge sort algorithm?

Can someone please tell me what is wrong with this piece of code? This code is supposed to sort the elements of an array using merge sort.
#include <iostream>

void merge(int arr[], int left, int mid, int right) {
    int left_ptr = left;
    int right_ptr = mid + 1;
    int size = right - left + 1;
    int temp[size];
    int k = left;
    while (left_ptr <= mid && right_ptr <= right)
    {
        if (arr[left_ptr] <= arr[right_ptr]) {
            temp[k] = arr[left_ptr];
            left_ptr++;
            k++;
        }
        else {
            temp[k] = arr[right_ptr];
            right_ptr++;
            k++;
        }
    }
    while (left_ptr <= mid)
    {
        temp[k] = arr[left_ptr];
        left_ptr++;
        k++;
    }
    while (right_ptr <= right)
    {
        temp[k] = arr[right_ptr];
        right_ptr++;
        k++;
    }
    for (int i = left_ptr; i < k; i++)
    {
        arr[i] = temp[i];
    }
}

void mergeSort(int arr[], int left, int right) {
    int mid;
    if (left < right)
    {
        mid = (right + left) / 2;
        mergeSort(arr, left, mid);
        mergeSort(arr, mid + 1, right);
        merge(arr, left, mid, right);
    }
}

int main() {
    int arr[] = {45, 8, 9, 7, 4, 58, 2, 34, 2, 58};
    std::cout << arr << std::endl;
    int size = sizeof(arr) / sizeof(int);
    mergeSort(arr, 0, size - 1);
    for (int i = 0; i < size; i++)
    {
        std::cout << arr[i] << " ";
    }
    std::cout << std::endl;
}
I double-checked it against many examples online and I see no error... What do you think went wrong? I tried to implement this in place on the array (something like quicksort).
Here is a list of things wrong with your code. I'm taking "wrong" broadly here. For each bit of code, my primary criticism is style based not "correctness", where the style aimed for is one that makes correctness easier to spot.
Along the way, one of the style criticisms results in spotting what looks like a bug.
void merge(int arr[], int left, int mid, int right){
You are using int to refer to offsets in an array.
You are using int[] parameters, which is a legacy C syntax for int* arr. Use something like std::span instead.
Going on:
int left_ptr = left;
If your goal is to preserve the original arguments and work on copies, make the original arguments const so someone doesn't have to prove they aren't mutated in the body of the function.
int right_ptr = mid + 1;
You have variables called _ptr that aren't pointers.
int size = right - left + 1;
You appear not to be using half-open intervals. Learn to use half-open intervals: they are conventional in C++ and they really do get rid of lots of fence-post correcting code.
int temp[size];
This is not compliant C++. Practically, even on compilers that support this, many C++ implementations have much smaller stacks than the memory of arrays you might want to sort. This then results in your code blowing its stack.
Correctness is more important than performance. Creating dynamically sized objects on the stack leads to programs that engage in undefined behavior or crash on otherwise reasonable inputs.
int k = left;
This variable's name does not describe what it does.
while (left_ptr <= mid && right_ptr <= right)
while (left_ptr <= mid)
while (right_ptr <= right)
There is a lot of code duplication in these loops.
DRY - don't repeat yourself. Here, if there is a bug in any one of the repeats, then once you DRY the code the bug would appear in all uses and be easier to spot. There are a lot of ways to DRY here (lambdas, helper functions, slightly more complex branching and one loop); use one of them.
for (int i = left_ptr; i < k; i++)
{
arr[i] = temp[i];
}
Looks like a manual std::copy? It also looks like it has a bug, because of course manually reimplementing std::copy means you did it wrong.
void mergeSort(int arr[], int left, int right){
Again, legacy C-style array passing.
int mid;
No need to declare this without initializing it. Move declarations as close as possible to their point of first use, and have them fall out of scope as soon as possible.
if (left < right)
{
mid = (right + left)/2;
Make this int mid = ... instead.
mergeSort(arr, left, mid);
mergeSort(arr, mid + 1, right);
An example of how closed intervals make you have to do annoying fenceposting.
merge(arr, left, mid, right);
mergeSort(arr, 0, size - 1);
Another fencepost +/- 1 here.
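To pull a few of these suggestions together, here is a rough, illustrative-only sketch of what the recursion could look like with half-open intervals and std::span (C++20 assumed), using the standard std::inplace_merge instead of the hand-written merge:

#include <algorithm>
#include <cstddef>
#include <span>

// Rough sketch: half-open intervals and a span instead of (arr, left, right).
// The recursive calls need no +1/-1 fenceposting.
void mergeSortSpan(std::span<int> data) {
    if (data.size() < 2)
        return;
    std::size_t mid = data.size() / 2;
    mergeSortSpan(data.first(mid));    // sorts [0, mid)
    mergeSortSpan(data.subspan(mid));  // sorts [mid, size)
    std::inplace_merge(data.begin(), data.begin() + mid, data.end());
}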
I see two possible errors in the code:
Declaring int temp[size]; in merge is not valid, as size is not a constant. You will need to allocate the array dynamically.
Secondly, in the last segment of the merge function (the for loop), you initialize i = left_ptr. However, left_ptr is set equal to mid before that. I believe you actually want to initialize i = left.
EDIT: Just noticed: temp does not necessarily have to start at the beginning of arr. What I mean is that each element of temp is mapped to a specific element of arr, but your code assumes in several places that temp[0] is mapped to arr[0], which is only true when left is 0 (temp[0] is actually mapped to arr[left]). There are two ways to fix that.
You can fix the pieces that are based on this assumption: initialize k as zero, and in the final for loop run i from 0 up to k and write arr[left + i] = temp[i] instead of arr[i] = temp[i] (a rough sketch of this option follows below).
The second option is, instead of creating and deleting temp in every single call to merge, to create it equal in size to arr and hold onto it for the entire execution of the algorithm (that could be done by creating it outside the algorithm and passing it to each call to merge or mergeSort). That way, the equal-offset assumption would actually be correct.
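A rough sketch of the first option, for illustration only (names follow the question's code; a std::vector replaces the non-standard variable-length array):

#include <vector>

// Sketch of the first fix: temp is indexed from 0 and the copy-back adds the
// `left` offset, so temp[0] maps to arr[left].
void merge_fixed(int arr[], int left, int mid, int right) {
    std::vector<int> temp(right - left + 1);  // dynamic, no VLA needed
    int left_ptr = left, right_ptr = mid + 1, k = 0;

    while (left_ptr <= mid && right_ptr <= right)
        temp[k++] = (arr[left_ptr] <= arr[right_ptr]) ? arr[left_ptr++] : arr[right_ptr++];
    while (left_ptr <= mid)
        temp[k++] = arr[left_ptr++];
    while (right_ptr <= right)
        temp[k++] = arr[right_ptr++];

    for (int i = 0; i < k; i++)
        arr[left + i] = temp[i];  // map temp[0..k) back onto arr[left..right]
}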

C++ sorting function for scheduling program

I have an assignment where I am given a list of events with a start time and an end time, and I need to schedule as many events as possible for that day, assuming there is only one room to use. To do this, the events are sorted by end time. The sorting algorithm I am supposed to implement is as follows:
sort() — a function to sort a float array data[], creating an array of sorted indices. The sort() function does not sort the data, but fills the array indx[] so that
data[indx[0]], data[indx[1]], ..., data[indx[NUM_EVENTS - 1]]
are the values of data[] in ascending order.
I am a little bit confused about what exactly it is asking, but anyway, this is what I have so far:
void sort(float data[], int indx[], int len) {
    int temp;
    for (int i = 0; i < len; i++) {
        for (int j = 0; j < len; j++) {
            if (data[j] > data[j+1]) {
                temp = data[j];
                indx[j] = data[j+1];
                indx[j+1] = temp;
            }
        }
    }
}
This code compiles but doesn't behave as it should. When I try to print what is in indx[], I get strange results. Any help is greatly appreciated. Thanks!
You are reading uninitialized memory. You are copying elements from data to indx, but only when data[j] > data[j + 1]. When that isn't true twice in a row, you have an element of indx that isn't assigned a value. It has an indeterminate value of random bits left by whatever used the memory before, and reading it is undefined behavior.
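For example, here is a rough sketch (illustrative only) of one way to do what the assignment describes: fill indx with 0, 1, ..., len-1 first, so every element is initialized, and then sort the indices by comparing the data values they refer to.

#include <algorithm>
#include <numeric>

// Sketch: sort indices, not the data. Every indx[i] is initialized up front,
// so nothing is read uninitialized. Names follow the question's code.
void sort_indices(const float data[], int indx[], int len) {
    std::iota(indx, indx + len, 0);  // indx = {0, 1, 2, ..., len-1}
    std::sort(indx, indx + len,
              [&](int a, int b) { return data[a] < data[b]; });
}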

How do I drop the lowest value?

I'm pretty new to C++, and I need help figuring out the code for dropping the lowest value of a randomly generated set of numbers. Here is my code so far:
// Create an array and populate it with scores between 55 and 100
// Drop lowest score
#include <iostream>
#include <cstdlib>   // for generating a random number
#include <ctime>
#include <iomanip>
#include <algorithm>
#include <vector>
using namespace std;

// function prototype
int *random(int);

int main()
{
    int *numbers;  // points to the numbers
    // get an array of 20 values
    numbers = random(20);
    // display numbers
    for (int count = 0; count < 20; count++)
        cout << numbers[count] << endl;
    cout << endl;
    system("pause");
    return 0;
}

// random function, generates random numbers between 55 and 100 ??
int *random(int num)
{
    int *arr;  // array to hold numbers
    // return null if zero or negative
    if (num <= 0)
        return NULL;
    // allocate array
    arr = new int[num];
    // seed random number generator
    srand(time(0));
    // populate array
    for (int count = 0; count < num; count++)
        arr[count] = (rand() % 45 + 55);
    // return pointer
    return arr;
}
For this piece of code, how would I sort or find the lowest score to drop it after the function returns the random numbers?
Your suggestions are appreciated!
In general, to find the lowest value in an array, you can follow this pseudo-algorithm:
min = array[0]  // first element in the array
for (all_values_in_array)
{
    if (current_element < min)
        min = current_element
}
However, you can't "drop" a value out of a static array. You could look into using a dynamic container (e.g. std::vector), or swapping the lowest value with the last value and pretending the size of the array is one less (sketched below). Another low-level option would be to create your own dynamic array on the heap; however, this is probably more complicated than you are looking for.
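For example, a rough sketch of that swap-with-last idea (the function name is made up; it returns the new logical count):

#include <utility>  // std::swap

// Sketch of "swap the lowest value with the last element and shrink the
// logical size by one".
int drop_lowest(int numbers[], int count) {
    if (count <= 0) return count;
    int min_index = 0;
    for (int i = 1; i < count; i++)
        if (numbers[i] < numbers[min_index])
            min_index = i;
    std::swap(numbers[min_index], numbers[count - 1]);
    return count - 1;  // treat this as the array's new logical size
}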
Using a vector would be much easier. To drop the lowest element, you just have to sort in reverse order, then remove the last element. Personally, I would recommend using a vector.
The obvious approach to find the smallest element is to use std::min_element(). You probably want to use std::vector<T> to hold your elements but this isn't absolutely necessary. You can remove the smallest value from an array like this:
if (count) {
    int* it = std::min_element(array, array + count);
    std::copy(it + 1, array + count--, it);
}
Assuming you reasonably used std::vector<int> instead, the code would look something like this:
if (!array.empty()) {
    array.erase(std::min_element(array.begin(), array.end()));
}
First find the index of the lowest number:
int lowest_index = 0, i;
for (i = 0; i < 20; i++)
    if (arr[i] < arr[lowest_index])
        lowest_index = i;
Now that we know the index, move the numbers coming after that index back by one to overwrite the one we found. The number of values to move will be 19 minus the found index. I.e., if index 2 (the third number, since the first is at index 0) is the lowest, then 17 numbers come after that index, so that's how many we need to move. Because the source and destination ranges overlap, use memmove rather than memcpy:
memmove(&arr[lowest_index], &arr[lowest_index + 1], sizeof(int) * (19 - lowest_index));
Good luck!
Sort the array ascending.
The lowest value will be at the beginning of the array.
Or sort the array descending and remove the last element.
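For example, a rough sketch of the "sort descending, then remove the last element" idea with a std::vector (the function name is made up):

#include <algorithm>
#include <functional>
#include <vector>

// Sketch: sort in descending order, then drop the smallest value at the end.
void drop_lowest_sorted(std::vector<int>& scores) {
    std::sort(scores.begin(), scores.end(), std::greater<int>());
    if (!scores.empty())
        scores.pop_back();  // the smallest value now sits at the end
}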
Further to what others have said, you may also choose to use something like, perhaps a std::list. It's got sorting built-in, also offering the ability to define your own compare function for two elements. (Though for ints, this is not necessary)
First, I typically typedef the vector or list with the type of the elements it will contain. Next, for lists I typedef an iterator - though both of these are merely a convenience, neither is necessary.
Once you've got a list that will holds ints, just add them to it. Habit and no need to do otherwise means I'll use .push_back to add each new element. Once done, I'll sort the list, grab the element with the lowest value (also the lowest 'index' - the first item), then finally, I'll remove that item.
Some code to muse over:
#include <cstdio>
#include <cstdlib>
#include <list>
using namespace std;

typedef list<int> listInt;
typedef listInt::iterator listIntIter;

bool sortAsc(int first, int second)
{
    return first < second;
}

bool sortDesc(int first, int second)
{
    return first > second;
}

int main(void)
{
    listInt mList;
    listIntIter mIter;
    int i, curVal, lowestScore;

    for (i = 1; i <= 20; i++)
    {
        curVal = rand() % 45 + 55;
        mList.push_back(curVal);
        printf("%2d. %d\n", i, curVal);
    }
    printf("\n");

    mList.sort();
    // mList.sort(sortAsc);  // in this example, this has the same effect as the line above
    // mList.sort(sortDesc);

    i = 0;
    for (mIter = mList.begin(); mIter != mList.end(); mIter++)
        printf("%2d. %d\n", ++i, *mIter);
    printf("\n");

    lowestScore = mList.front();
    mList.pop_front();
    printf("Lowest score: %d\n", lowestScore);
    return 0;
}
Oh, and the choice to use printf rather than cout was deliberate too. For a couple of reasons.
Personal preference - I find it easier to type printf("%d\n", someVar);
than cout << someVar << endl;
Size - built with gcc under windows, the release-mode exe of this example is 21kb.
Using cout, it leaps to 459kb - for the same functionality! A 20x size increase for no gain? No thanks!!
Here's an std::list reference: http://www.cplusplus.com/reference/stl/list/
In my opinion, the best solution to your problem would be to use a linked list to store the numbers. That way you can find the smallest element with an O(n) scan, similar to the method given by user1599559 or Mikael Lindqvist; you only need to store, together with the minimum value, a pointer to the item (Item X) in the linked list that holds it. Then, to eliminate Item X, just make Item X - 1 point to Item X + 1 and free the memory allocated by Item X.

C++ quick sort running time

I have a question about the quick sort algorithm. I implemented quick sort and have been experimenting with it.
The elements in the initial unsorted array are random numbers chosen from a certain range.
I find that the range of the random numbers affects the running time. For example, a run on 1,000,000 random numbers chosen from the range 1-2,000 takes 40 seconds, while it takes 9 seconds if the 1,000,000 numbers are chosen from the range 1-10,000.
But I do not know how to explain it. In class, we talked about how the pivot value can affect the depth of the recursion tree.
For my implementation, the last value of the array is chosen as the pivot. I do not use a randomized scheme to select the pivot.
#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>
using namespace std;

int partition(vector<int> &vec, int p, int r) {
    int x = vec[r];
    int i = (p - 1);
    int j = p;
    while (1) {
        if (vec[j] <= x) {
            i = (i + 1);
            int temp = vec[j];
            vec[j] = vec[i];
            vec[i] = temp;
        }
        j = j + 1;
        if (j == r)
            break;
    }
    int temp = vec[i + 1];
    vec[i + 1] = vec[r];
    vec[r] = temp;
    return i + 1;
}

void quicksort(vector<int> &vec, int p, int r) {
    if (p < r) {
        int q = partition(vec, p, r);
        quicksort(vec, p, q - 1);
        quicksort(vec, q + 1, r);
    }
}

void random_generator(int num, int *array) {
    srand((unsigned)time(0));
    int random_integer;
    for (int index = 0; index < num; index++) {
        random_integer = (rand() % 10000) + 1;
        *(array + index) = random_integer;
    }
}

int main() {
    int array_size = 1000000;
    int input_array[array_size];
    random_generator(array_size, input_array);
    vector<int> vec(input_array, input_array + array_size);

    clock_t t1, t2;
    t1 = clock();
    quicksort(vec, 0, (array_size - 1));  // call quick sort
    int length = vec.size();
    t2 = clock();

    float diff = ((float)t2 - (float)t1);
    cout << diff << endl;
    cout << diff / CLOCKS_PER_SEC << endl;
}
Most likely it's not performing well because quicksort doesn't handle lots of duplicates very well and may still end up swapping them (the order of key-equal elements isn't guaranteed to be preserved). You'll notice that the number of duplicates per value is about 100 for a range of 10,000, or about 500 for a range of 2,000, while the time factor is also approximately a factor of 5.
Have you averaged the runtimes over at least 5-10 runs at each size to give it a fair shot of getting a good starting pivot?
As a comparison have you checked to see how std::sort and std::stable_sort also perform on the same data sets?
Finally, for this distribution of data (unless this is a quicksort exercise), I think counting sort would be much better - roughly 40 KB of memory to store the counts, and it runs in O(n).
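For illustration, a rough counting-sort sketch for values known to lie in 1..max_value (the function name is made up; with max_value = 10000 the counts take roughly 40 KB):

#include <vector>

// Sketch: tally each value, then write the values back in order.
void counting_sort(std::vector<int> &vec, int max_value) {
    std::vector<int> counts(max_value + 1, 0);
    for (int v : vec)
        counts[v]++;                 // count occurrences of each value
    int out = 0;
    for (int value = 1; value <= max_value; value++)
        for (int c = 0; c < counts[value]; c++)
            vec[out++] = value;      // emit each value as many times as it occurred
}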
It probably has to do with how well sorted the input is. Quicksort is O(n log n) if the input is reasonably random. If it's in reverse order, performance can degrade to O(n^2). You're probably getting closer to the O(n^2) behavior with the smaller data range.
Late answer - the effect of duplicates depends on the partition scheme. The example code in the question is a variation of the Lomuto partition scheme, which takes more time as the number of duplicates increases, due to the partitioning getting worse. In the case of all equal elements, Lomuto only reduces the size by one element at each level of recursion.
If instead the Hoare partition scheme were used (with the middle value as pivot), it generally takes less time as the number of duplicates increases. Hoare will needlessly swap values equal to the pivot, due to duplicates, but the partitioning approaches the ideal case of splitting the array into nearly equally sized parts. The swap overhead is somewhat masked by the memory cache. Link to the Wikipedia example of the Hoare partition scheme (a rough sketch of that scheme follows the link):
https://en.wikipedia.org/wiki/Quicksort#Hoare_partition_scheme
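For illustration only, a rough sketch of a Hoare-style partition with the middle value as pivot, in the spirit of the Wikipedia scheme rather than the question's code (names are made up):

#include <utility>
#include <vector>

// Sketch of a Hoare-style partition; the pivot is the middle element.
int hoare_partition(std::vector<int> &vec, int lo, int hi) {
    int pivot = vec[lo + (hi - lo) / 2];
    int i = lo - 1;
    int j = hi + 1;
    while (true) {
        do { i++; } while (vec[i] < pivot);
        do { j--; } while (vec[j] > pivot);
        if (i >= j)
            return j;                // split point: recurse on [lo, j] and [j+1, hi]
        std::swap(vec[i], vec[j]);
    }
}

void quicksort_hoare(std::vector<int> &vec, int lo, int hi) {
    if (lo < hi) {
        int p = hoare_partition(vec, lo, hi);
        quicksort_hoare(vec, lo, p);     // note: p stays on the left side here
        quicksort_hoare(vec, p + 1, hi);
    }
}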