Fork Join not sorting correctly - concurrency

I'm trying to adapt merge sort to use fork/join.
I'm using this as a fork/join example, so I need to keep it basic.
I want to change the regular merge sort so that when the segment size drops to 100 or less, it uses an insertion sort on that segment, then returns from the recursive call and starts merging the segments back together.
The sort simply does not work. If I change the order of the fork and join calls (so that one task finishes before the other starts) it works fine, so I assume the result is wrong because my sorts are running concurrently.
For example if I write
int mid = (lb+ub)/2;
MergeInsertSortQ left = new MergeInsertSortQ(f,lb,mid);
MergeInsertSortQ right = new MergeInsertSortQ(f,mid,ub);
left.fork(); left.join();
right.fork(); right.join();
merge(f,lb,mid,ub);
It sorts fine, but this is essentially sequential, so it is not really what I'm trying to do.
Here is the code I am using (including a little test in main)
import java.util.concurrent.*;
public class MergeInsertSortQ extends RecursiveAction{
private int[] f;
private int lb, ub;
public MergeInsertSortQ(int f[], int lb, int ub)
{
this.f = f;
this.lb=lb;
this.ub=ub;
}
protected void compute(){
//Insertion Sort performed when a segment of size
//100 or less is reached otherwise do merge sort
if(ub-lb>100)
{
int mid = (lb+ub)/2;
MergeInsertSortQ left = new MergeInsertSortQ(f,lb,mid);
MergeInsertSortQ right = new MergeInsertSortQ(f,mid,ub);
invokeAll(left,right);
left.join();
right.join();
merge(f,lb,mid,ub);
}
else if(ub-lb>=1)
{
for (int i = lb; i<ub;i++)
{
int temp = f[i];
int j = i-1;
while (j >= 0 && f[j] > temp)
{
f[j+1] = f[j];
j = j-1;
}
f[j+1] = temp;
}
}
}
protected void merge(int f[], int lower, int middle, int top){
int i = lower; int j = middle;
//use temp array to store merged sub-sequence
int temp[] = new int[top-lower]; int t = 0;
while(i < middle && j < top)
{
if(f[i] <= f[j])
{
temp[t]=f[i];i++;t++;
}
else{
temp[t] = f[j]; j++; t++;
}
}
//tag on remaining sequence
while(i < middle)
{
temp[t]=f[i]; i++; t++;
}
while(j < top)
{
temp[t] = f[j]; j++; t++;
}
//copy temp back to f
i = lower; t = 0;
while(t < temp.length)
{
f[i] = temp[t]; i++; t++;
}
}
public static void main(String[] args)
{
//Initialise & print array before sorting
int A[]=new int[200];
for(int j=0;j<A.length;j++)
{
A[j]=(int)(Math.random()*10000);
System.out.print(A[j]+" ");
}
System.out.println();
System.out.println("**********************************");
System.out.println();
// Do the sort
ForkJoinPool fjPool=new ForkJoinPool();
fjPool.invoke(new MergeInsertSortQ(A,0,A.length));
//Check if array sorted and print
boolean sorted=true;
for(int i=0;i<A.length-1;i++)
{
System.out.print(A[i] + " ");
if (A[i]>A[i+1])
{
System.out.println();
System.out.println(A[i]+" is greater than "+A[i+1]+", location "+i);
sorted=false;
//break;
}
}
System.out.println();
if (sorted) System.out.println("The array is sorted!!!");
else System.out.println("WARNING: The array is NOT SORTED");
}
}
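One thing worth checking in the code above: the inner loop of the insertion sort runs while j >= 0, so a task that owns the segment [lb, ub) can shift elements into positions below lb, which belong to another task. When the left task finishes before the right one starts, this is harmless, which would match the observation that the sequential ordering sorts correctly. When both run at once, the two tasks write to the same slots. Below is a minimal sketch of an insertion sort that stays inside its own segment, written in C++ like the other examples on this page; the name insertionSortRange is made up for illustration, and the same one-character change (0 to lb) applies to the Java loop.

```cpp
#include <vector>

// Insertion sort restricted to the half-open range [lb, ub) of f.
// It never reads or writes an index below lb, so two calls on
// disjoint ranges do not interfere with each other.
void insertionSortRange(std::vector<int>& f, int lb, int ub) {
    for (int i = lb + 1; i < ub; ++i) {
        int temp = f[i];
        int j = i - 1;
        while (j >= lb && f[j] > temp) {  // stop at lb, not at 0
            f[j + 1] = f[j];
            --j;
        }
        f[j + 1] = temp;
    }
}
```

Calling it on the upper half of an array leaves the lower half untouched, which is the property the concurrent version relies on.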

Related

Getting different output on each execution

I tried to count the number of inversions in an array using merge sort.
Every time I execute this code, I get different value of the number of inversions. I am not able to figure out the reason for this. Please have a look at the code and tell me the mistake.
#include<stdio.h>
#include<iostream>
using namespace std;
int count =0;
void merge(int A[],int start,int mid,int end)
{
int size1 = mid-start+1;
int size2 = end-(mid+1)+1;
int P[size1];
int Q[size2];
for(int i=0;i<size1;i++)
P[i]=A[start+i];
for(int j=0;j<size2;j++)
Q[j]=A[mid+j+1];
int k = 0;
int l = 0;
int i =0;
while(k<mid && l<end)
{
if(P[k]>Q[l])
{
A[i] = Q[l];
l++; i++;
count++;
}
else
{
A[i] = P[k];
k++; i++;
}
}
}
void inversions(int A[],int start,int end)
{
if(start!=end)
{
int mid = (start+end)/2;
inversions(A,start,mid);
inversions(A,mid+1,end);
merge(A,start,mid,end);
}
}
int main()
{
int arr[] = {4,3,1,2,7,5,8};
int n = (sizeof(arr) / sizeof(int));
inversions(arr,0,n-1);
cout<<"The number of inversions is:: "<<count<<endl;
return 0;
}
int k = 0;
int l = 0;
int i =0;
while(k<mid && l<end)
{
if(P[k]>Q[l])
{
A[i] = Q[l];
l++; i++;
count++;
}
else
{
A[i] = P[k];
k++; i++;
}
}
There are a few mistakes here. i must start from start, not 0. k must loop from 0 up to size1, not mid, and similarly l must loop from 0 up to size2, not end. Incrementing count by 1 when P[k] > Q[l] is also incorrect: because P is sorted, every element of P from P[k] onward is greater than Q[l], and each of them forms an inverted pair with Q[l]. So you should increment count by size1-k.
Also, the merge procedure should not only count the inversions but also merge the two sorted sequences P and Q into A. The loop while(k<size1 && l<size2) exits as soon as k equals size1 or l equals size2, so you must copy whatever remains of the other sequence back into A.
I have made the appropriate changes in merge and pasted it below.
void merge(int A[],int start,int mid,int end)
{
int size1 = mid-start+1;
int size2 = end-(mid+1)+1;
int P[size1];
int Q[size2];
for(int i=0;i<size1;i++)
P[i]=A[start+i];
for(int j=0;j<size2;j++)
Q[j]=A[mid+j+1];
int k = 0;
int l = 0;
int i = start;
while(k<size1 && l<size2)
{
if(P[k]>Q[l])
{
A[i] = Q[l];
l++; i++;
count += size1-k;
}
else
{
A[i] = P[k];
k++; i++;
}
}
while (k < size1)
{
A[i] = P[k];
++i, ++k;
}
while (l < size2)
{
A[i] = Q[l];
++i, ++l;
}
}
int P[size1];
int Q[size2];
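VLAs (variable-length arrays) are not part of standard C++; some compilers, such as GCC, accept them only as an extension, because size1 and size2 are not known at compile time. The VLAs themselves are not what makes the output vary, though: the loop bounds k<mid and l<end let k and l run past size1 and size2, so the code reads memory outside P and Q, and whatever happens to be there can differ from run to run.
Use std::vector instead: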
std::vector<int> P(size1, 0); //initialize with size1 size
std::vector<int> Q(size2, 0); //initialize with size2 size
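Putting the pieces together, here is a minimal sketch of the whole inversion count using std::vector and a return value instead of the global count. The names mergeCount and countInversions are made up for illustration; the bounds are inclusive, as in the question.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted halves a[start..mid] and a[mid+1..end] (inclusive bounds)
// and returns the number of inversions between the two halves.
long long mergeCount(std::vector<int>& a, int start, int mid, int end) {
    std::vector<int> left(a.begin() + start, a.begin() + mid + 1);
    std::vector<int> right(a.begin() + mid + 1, a.begin() + end + 1);
    std::size_t k = 0, l = 0;
    int i = start;
    long long count = 0;
    while (k < left.size() && l < right.size()) {
        if (left[k] > right[l]) {
            a[i++] = right[l++];
            // every remaining element of left is greater than right[l]
            count += static_cast<long long>(left.size() - k);
        } else {
            a[i++] = left[k++];
        }
    }
    // copy whichever half still has elements
    while (k < left.size()) a[i++] = left[k++];
    while (l < right.size()) a[i++] = right[l++];
    return count;
}

// Sorts a[start..end] and returns the number of inversions it contained.
long long countInversions(std::vector<int>& a, int start, int end) {
    if (start >= end) return 0;
    int mid = start + (end - start) / 2;
    long long count = countInversions(a, start, mid);
    count += countInversions(a, mid + 1, end);
    count += mergeCount(a, start, mid, end);
    return count;
}
```

For the array {4,3,1,2,7,5,8} from the question, countInversions(arr, 0, 6) returns 6: the pairs (4,3), (4,1), (4,2), (3,1), (3,2) and (7,5).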

Quick Sort program stopped working

I was trying to solve the quick sort - 2 challenge on hackerrank. It said that we had to repeatedly call partition till the entire array was sorted. My program works for some test cases but for some it crashes, "Quick Sort - 2.exe has stopped working". I couldn't find the reason as to why it's happening.
The first element of the array/sub-array was to be taken as pivot element each time.
#include <iostream>
#include <conio.h>
using namespace std;
void swap(int arr[], int a, int b)
{
int c = arr[a];
arr[a] = arr[b];
arr[b] = c;
}
void qsort(int arr[], int m, int n) //m - lower limit, n - upper limit
{
if (n - m == 1)
{
return;
}
int p = arr[m], i, j, t; //p - pivot element, t - temporary
//partition
for (int i = m+1; i < n; i++)
{
j = i;
if (arr[j] < p)
{
t = arr[j];
while (arr[j] != p)
{
arr[j] = arr[j-1];
j--;
}
arr[j] = t; //pivot is at j and j+1
}
}
//check if sorted
int f = 1;
while (arr[f] > arr[f-1])
{
if (f == n-1)
{
f = -1;
break;
}
f++;
}
if (f == -1)
{
cout << "Sub Array Sorted\n";
}
else
{
if (p == arr[m]) //pivot is the smallest in sub array
{
qsort(arr, m+1, n); //sort right sub array
}
else
{
qsort(arr, m, j+1); //sort left sub array
qsort(arr, j+1, n); //sort right sub array
}
}
}
int main()
{
int n;
cin >> n;
int arr[n];
for (int i = 0; i < n; i++)
{
cin >> arr[i];
}
qsort(arr, 0, n);
for (int i = 0; i < n; i++)
{
cout << arr[i] << " ";
}
return 0;
}
You have an index out of range problem.
This will not give you the solution, but it may help you to find the reason why your program fails.
I have modified your program so it uses a vector of int rather than a raw array of int, and when you run this program you get an index out of range exception.
The sequence 4 3 7 1 6 4 that triggers the problem is hardcoded, so you don't need to type it each time.
#include <iostream>
#include <vector>
using namespace std;
void swap(vector<int> & arr, int a, int b)
{
int c = arr[a];
arr[a] = arr[b];
arr[b] = c;
}
void qsort(vector<int> & arr, int m, int n) //m - lower limit, n - upper limit
{
if (n - m == 1)
{
return;
}
int p = arr[m], j, t; //p - pivot element, t - temporary
//partition
for (int i = m + 1; i < n; i++)
{
j = i;
if (arr[j] < p)
{
t = arr[j];
while (arr[j] != p)
{
arr[j] = arr[j - 1];
j--;
}
arr[j] = t; //pivot is at j and j+1
}
}
//check if sorted
int f = 1;
while (arr[f] > arr[f - 1])
{
if (f == n - 1)
{
f = -1;
break;
}
f++;
}
if (f == -1)
{
cout << "Sub Array Sorted\n";
}
else
{
if (p == arr[m]) //pivot is the smallest in sub array
{
qsort(arr, m + 1, n); //sort right sub array
}
else
{
qsort(arr, m, j + 1); //sort left sub array
qsort(arr, j + 1, n); //sort right sub array
}
}
}
int main()
{
vector<int> arr = { 4,3,7,1,6,4 };
qsort(arr, 0, arr.size());
for (unsigned int i = 0; i < arr.size(); i++)
{
cout << arr[i] << " ";
}
return 0;
}
First of all, what you have written is not quicksort but a combination of divide-and-conquer partitioning and insertion sort.
Canonical quicksort moves inward from the lower (p) and upper (q) bounds of the array, skipping elements with arr[p] < m and arr[q] > m respectively, where m here denotes the pivot value. It then swaps arr[p] with arr[q], increments p, decrements q, and checks whether p>=q. Rinse and repeat until p>=q, then make the recursive calls on the sub-partitions. This way p or q holds the pivot position and the sub-calls are obvious.
But you are doing it a different way: you insert elements from the right side of the subarray into the left side. That can cost O(N^2) time for a single partitioning pass. Consider the sequence 1,0,1,0,1,0,1,0,1,0,1,0,... for example. This can push the worst-case complexity above O(N^2).
Time complexity aside, the problem in your function lies in the assumption that j holds the pivot location when you make the sub-calls:
qsort(arr, m, j+1); //sort left sub array
qsort(arr, j+1, n); //sort right sub array
Actually, j is reset to i on every iteration of your main for loop. If the last element is equal to or greater than the pivot, you end up with j=n-1, then you call qsort(arr, n, n), and the check on the first lines lets it through (sic!), because n-n != 1.
To fix this you should do two things:
1) find pivot location directly after rearrange:
for (int i = m; i < n; i++)
if (p == arr[i])
{
j = i;
break;
}
or keep the pivot location in a separate variable, updating it after this line:
arr[j] = t; //pivot is at j and j+1
and update the recursive calls to use the new variable instead of j.
2) make a more bulletproof check in the beginning of your function:
if (n - m <= 1)
The latter alone will be enough to get some result, but it will be much less efficient than your current idea, probably falling to O(N^3) in the worst case.
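For comparison, the canonical scheme described above can be sketched as follows. This is only an illustration: the name quickSort, the inclusive bounds and the choice of the middle element as pivot are assumptions made here (the challenge itself asks for the first element as pivot), not part of the original program.

```cpp
#include <utility>
#include <vector>

// Sorts arr[lo..hi] (both bounds inclusive) in ascending order.
void quickSort(std::vector<int>& arr, int lo, int hi) {
    if (lo >= hi) return;                  // covers empty and one-element ranges
    int pivot = arr[lo + (hi - lo) / 2];   // pivot value, middle element (assumption)
    int p = lo, q = hi;
    while (p <= q) {
        while (arr[p] < pivot) ++p;        // skip elements already on the correct side
        while (arr[q] > pivot) --q;
        if (p <= q) {
            std::swap(arr[p], arr[q]);
            ++p;
            --q;
        }
    }
    quickSort(arr, lo, q);                 // left sub-partition
    quickSort(arr, p, hi);                 // right sub-partition
}
```

With the sequence from the test program, quickSort(arr, 0, arr.size() - 1) on 4 3 7 1 6 4 yields 1 3 4 4 6 7. Note that the base case is lo >= hi, which plays the same role as the n - m <= 1 check suggested above.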

Inserting value into sorted array without duplicates: C++

For this program I have three data files. The first has a list of numbers; the second is a list of numbers, each with an add (A) or delete (D) command. I have to put the numbers from the first file into the third file, then update that final file based on the commands and numbers in the second file. The third file can't have duplicates and must stay sorted while values are being inserted. Here are the functions I have; I'm having difficulty getting the items stored into the array without duplicates. The array must be statically sized, so I used a #define for a max size of 2000, which is more than enough to handle the numbers I need. Thanks so much! If I should upload the main let me know, but I'm fairly certain the problem lies in one of these functions.
int search(int value, int list[], int n) // returns index, n is logical size of array
{
int index = -1;
for(int i = 0; i < n; i++)
{
if(value == list[i])
{
index = i;
return index;
}
}
return index;
}
void storeValue(int value, int list[], int& n)
{
int i = n;
for(; i > 0 && list[i - 1] < value; i--)
{
list[i] = list[i - 1];
}
list[i] = value;
n++;
}
void deleteValue(int loc, int list[], int n)
{
if(loc >= 0 && loc < n)
{
for(int i = loc; i < n - 1; i++)
list[i] = list[i +1];
n--;
}
}
UPDATE: Now duplicates are being stored, but only for some numbers.
For example: my 3rd file is: 1,2,8,8,9,101,101,104,etc.
The output should be: 1,2,8,9,101,104,etc
value: value to be inserted
list[]: array being modified (must be static)
n: logical size of array
I can't figure out why some numbers are duplicated and others aren't
In my main, I run the search function and if a -1 is returned (the value isn't already found) then I run the storeValue function.
Here are my updated functions:
int search(int value, int list[], int n) // returns index
{
int index = -1;
for(int i = 0; i < n; i++)
{
if(value == list[i])
{
index = i;
return index;
}
}
return index;
}
void storeValue(int value, int list[], int& n)
{
int i = n;
for(; i > 0 && list[i - 1] > value; i--)
{
list[i] = list[i - 1];
}
list[i] = value;
n++;
}
void deleteValue(int loc, int list[], int& n)
{
if(loc >= 0 && loc < n)
{
for(int i = loc; i < n; i++)
{
if (i == loc)
{
list[i] = list[i + 1];
i++;
}
}
n--;
}
}
In your deleteValue function, you remove the value but end up with a duplicate, because you only copy the next value into the current index and never shift the rest of the array. For example, if you have the array:
int array[3] = {1, 2, 3};
and you wanted to delete the second integer, your function currently would leave this in the array:
[1, 3, 3]
What you want to do is make a new array and loop through your entire list, making sure to leave out the deleted element, like so:
std::vector<int> deleteValue(int loc, const int list[], int n)
{
    std::vector<int> newArray; // requires #include <vector>
    if (loc < 0 || loc >= n)
    {
        newArray.assign(list, list + n); // invalid index: return an unchanged copy
        return newArray;
    }
    for (int i = 0; i < n; i++)
    {
        if (i == loc) // You have arrived at the element that needs to be deleted
            continue; // So we skip over the deleted element
        newArray.push_back(list[i]);
    }
    return newArray;
}
And this should take care of the case where a value is left duplicated after a deletion.
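Since the question says the array must be statically sized, the deletion can also be done in place by shifting everything after loc one slot to the left. Here is a minimal sketch of that, together with a duplicate-free sorted insert; storeUnique is a made-up name that combines the search and storeValue steps from the question, and the caller is assumed to guarantee room for one more element.

```cpp
// Returns the index of value in list[0..n), or -1 if it is absent.
int search(int value, const int list[], int n) {
    for (int i = 0; i < n; ++i)
        if (list[i] == value) return i;
    return -1;
}

// Inserts value into the ascending array list[0..n) unless it is already
// present. Returns true if the value was inserted.
bool storeUnique(int value, int list[], int& n) {
    if (search(value, list, n) != -1) return false;
    int i = n;
    for (; i > 0 && list[i - 1] > value; --i)
        list[i] = list[i - 1];   // shift larger elements right
    list[i] = value;
    ++n;
    return true;
}

// Removes list[loc] by shifting the tail left. Returns false for a bad index.
bool deleteValue(int loc, int list[], int& n) {
    if (loc < 0 || loc >= n) return false;
    for (int i = loc; i < n - 1; ++i)
        list[i] = list[i + 1];   // shift every later element, not just one
    --n;
    return true;
}
```

Feeding 8, 1, 101, 8, 2, 9, 101, 104 through storeUnique leaves 1, 2, 8, 9, 101, 104 with n equal to 6. Because n is taken by reference in deleteValue, the caller's logical size shrinks as well.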

How to use selection sort algorithm correctly to sort a list?

I cannot get this to work, seems like whatever I do it never sorts correctly.
I am trying to sort in a descending order based on number of points.
Bryan_Bickell 2 5 +2
Brandon_Bolig 0 3 0
Dave_Bolland 4 2 -1
Sheldon_Brookbank 0 4 -1
Daniel_Carcillo 0 1 +3
The middle column is the amount of points.
I am using 4 arrays to store all of those values. How would I correctly use selection sort on the arrays to get them into the right order?
I have tried all the answers below, but none of them seemed to work. This is what I have so far:
void sortArrays( string playerNames[], int goals[], int assists[], int rating[], int numPlayers )
{
int temp, imin;
int points[numPlayers];
for(int j = 0; j < numPlayers; j++)
{
points[j] = goals[j] + assists[j];
}
imin = points[0];
for(int i = 0; i < numPlayers; i++)
{
if (points[i] < imin)
{
imin = points[i];
}
}
for(int j = 1; j < numPlayers; j++)
{
if (points[j] > imin)
{
temp = points[j];
points[j] = points[j-1];
points[j-1] = temp;
}
}
}
it should go like this...
void selsort(int *a,int size)
{
int i,j,imin,temp;
//cnt++;
for(j=0;j<size;j++)
{
//cnt+=2;
imin=j;
for(i=j+1;i<size;i++)
{
//cnt+=2;
if(a[i]<a[imin])
{
//cnt++;
imin=i;
}
}
if(imin!=j)
{
//cnt+=3;
temp=a[j];
a[j]=a[imin];
a[imin]=temp;
}
}
}
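Applied to the four parallel arrays from the question, the same idea might look like the sketch below. It assumes, as the question's own code does, that points are goals plus assists; it selects the maximum instead of the minimum to get descending order, and it swaps all of the arrays together so each row stays intact. The name sortByPoints and the use of std::vector instead of raw arrays are choices made here for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Selection sort in descending order of points (goals + assists).
// All four containers are reordered in lockstep.
void sortByPoints(std::vector<std::string>& names, std::vector<int>& goals,
                  std::vector<int>& assists, std::vector<int>& rating) {
    const int n = static_cast<int>(names.size());
    std::vector<int> points(n);
    for (int j = 0; j < n; ++j) points[j] = goals[j] + assists[j];

    for (int j = 0; j < n - 1; ++j) {
        int imax = j;                         // index of the largest remaining key
        for (int i = j + 1; i < n; ++i)
            if (points[i] > points[imax]) imax = i;
        if (imax != j) {
            std::swap(points[j], points[imax]);
            std::swap(names[j], names[imax]);
            std::swap(goals[j], goals[imax]);
            std::swap(assists[j], assists[imax]);
            std::swap(rating[j], rating[imax]);
        }
    }
}
```

The loop in the question only reorders the points array, so even a correct sort there would leave the names and other columns in their original order.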
You don't need 4 arrays to store those records if only the middle column is used as the sort key. From my understanding, you are trying to sort the player records by number of points with selection sort. The code should look like the following, assuming records is your array of records:
void selectionSort(RECORD records[], int n) {
    int i, j, maxIndex;
    RECORD tmp;
    for (i = 0; i < n - 1; i++) {
        maxIndex = i;
        for (j = i + 1; j < n; j++) //find the current max
        {
            if (records[j].point > records[maxIndex].point)
            {
                //assume point is the number of points, middle column
                maxIndex = j;
            }
        }
        //put current max point record at correct position
        if (maxIndex != i) {
            tmp = records[i];
            records[i] = records[maxIndex];
            records[maxIndex] = tmp;
        }
    }
}
It will sort all your records in descending order, as you want.
How about storing the data in a std::vector and then sorting it with std::sort (from <algorithm>):
bool compare(int a, int b){
    return a > b; // descending order
}
void sort(std::vector<int> &data){
    std::sort(data.begin(), data.end(), compare);
}
Try to use std::vector as much as possible; the standard containers and algorithms are heavily optimized for performance and memory usage.
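To keep each player's columns together with std::sort, one option is to sort a vector of structs rather than a vector of plain ints. A minimal sketch follows; the Player struct and its field names are made up for illustration, and points are assumed to be goals plus assists as in the question's code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type: one row of the table in the question.
struct Player {
    std::string name;
    int goals;
    int assists;
    int rating;
};

// Sorts players in descending order of points (goals + assists).
void sortPlayersByPoints(std::vector<Player>& players) {
    std::sort(players.begin(), players.end(),
              [](const Player& a, const Player& b) {
                  return a.goals + a.assists > b.goals + b.assists;
              });
}
```

std::sort does not promise to keep players with equal points in their original order; std::stable_sort takes the same arguments and does.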

When does merge_sort beat quick_sort?

SO Posts
When to use merge sort and when to use quick sort?
Quick Sort Vs Merge Sort
Wikipedia
http://en.wikipedia.org/wiki/Merge_sort
http://en.wikipedia.org/wiki/Quicksort
quick_sort is supposed to have a worst case of O(n^2), while merge_sort is supposed to have no bad case and always be O(n*log n). I thought the difference depended on the ordering of the data set (reverse order, forward order, or random), but when I run a test, quick_sort is always faster. The code I used is below:
/*
Needs a reszie function added
*/
#include "c_arclib.cpp"
template <class T> class dynamic_array
{
private:
T* array;
T* scratch;
public:
int size;
dynamic_array(int sizein)
{
size=sizein;
array = new T[size]();
}
void print_array()
{
for (int i = 0; i < size; i++) cout << array[i] << endl;
}
void merge_recurse(int left, int right)
{
if(right == left + 1)
{
return;
}
else
{
int i = 0;
int length = right - left;
int midpoint_distance = length/2;
int l = left, r = left + midpoint_distance;
merge_recurse(left, left + midpoint_distance);
merge_recurse(left + midpoint_distance, right);
for(i = 0; i < length; i++)
{
if((l < (left + midpoint_distance)) && (r == right || array[l] > array[r]))
{
scratch[i] = array[l];
l++;
}
else
{
scratch[i] = array[r];
r++;
}
}
for(i = left; i < right; i++)
{
array[i] = scratch[i - left];
}
}
}
int merge_sort()
{
scratch = new T[size]();
if(scratch != NULL)
{
merge_recurse(0, size);
return 1;
}
else
{
return 0;
}
}
void quick_recurse(int left, int right)
{
int l = left, r = right, tmp;
int pivot = array[(left + right) / 2];
while (l <= r)
{
while (array[l] < pivot)l++;
while (array[r] > pivot)r--;
if (l <= r)
{
tmp = array[l];
array[l] = array[r];
array[r] = tmp;
l++;
r--;
}
}
if (left < r)quick_recurse(left, r);
if (l < right)quick_recurse(l, right);
}
void quick_sort()
{
quick_recurse(0,size);
}
void rand_to_array()
{
srand(time(NULL));
int* k;
for (k = array; k != array + size; ++k)
{
*k=rand();
}
}
void order_to_array()
{
int* k;
int i = 0;
for (k = array; k != array + size; ++k)
{
*k=i;
++i;
}
}
void rorder_to_array()
{
int* k;
int i = size;
for (k = array; k != array + size; ++k)
{
*k=i;
--i;
}
}
};
int main()
{
dynamic_array<int> d1(1000000);
d1.order_to_array();
clock_t time_start=clock();
d1.merge_sort();
clock_t time_end=clock();
double result = (double)(time_end - time_start) / CLOCKS_PER_SEC;
cout << result;
}
The worst case for quick sort is when the pivot element is the largest or smallest element in the array on every recursion. In that case you will have to do n-1 recursions (one side of each split is empty or holds a single element), and each recursion still scans its whole sub-array, which gives you O(n^2) overall.
You can reproduce the worst case for quick sort if you use an already sorted array and pick the first or last element as pivot element.
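A small experiment makes the quadratic behaviour visible. The sketch below is not the quick_recurse from the question (which picks the middle element); it is a hypothetical first-element-pivot quicksort, named quickSortFirstPivot here, that returns how many element comparisons it performed.

```cpp
#include <utility>
#include <vector>

// Quicksort on a[lo..hi] (inclusive) using the first element as pivot.
// Returns the number of element comparisons performed.
long long quickSortFirstPivot(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return 0;
    int pivot = a[lo];
    int store = lo;                 // last index holding a value < pivot
    long long comparisons = 0;
    for (int i = lo + 1; i <= hi; ++i) {
        ++comparisons;
        if (a[i] < pivot) {
            ++store;
            std::swap(a[store], a[i]);
        }
    }
    std::swap(a[lo], a[store]);     // move the pivot to its final position
    comparisons += quickSortFirstPivot(a, lo, store - 1);
    comparisons += quickSortFirstPivot(a, store + 1, hi);
    return comparisons;
}
```

On an already sorted array of n distinct values every partition is maximally lopsided, so the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2, for example 4950 for n = 100. The recursion depth is also n in that case, so very large sorted inputs can exhaust the stack.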
Merge sort works very well for data that won't fit into memory, because each pass is linear and can be read/written to disk. Quick sort isn't even an option in that case, although the two may be combined - quick sort blocks that fit into memory, and merge sort those blocks until done.
Consider the container type as well - mergesort will work much better with a linked list, because you can split the list into equal parts by just traversing it and assigning nodes to alternate sublists; rearranging things around a pivot for quicksort is considerably more involved.
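The linked-list case can be sketched in a few lines. This is an illustration under assumptions: Node is a hypothetical bare singly linked node, the caller owns the nodes, and the split deals nodes alternately onto two sublists exactly as described above (which means the sort is not stable).

```cpp
// Hypothetical minimal singly linked list node.
struct Node {
    int value;
    Node* next;
};

// Merges two ascending lists by relinking nodes; no element is copied.
Node* mergeLists(Node* a, Node* b) {
    Node head{0, nullptr};          // dummy node to simplify appending
    Node* tail = &head;
    while (a && b) {
        if (b->value < a->value) { tail->next = b; b = b->next; }
        else                     { tail->next = a; a = a->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;         // append whichever list is left
    return head.next;
}

// Sorts the list in ascending order and returns the new head.
Node* mergeSortList(Node* head) {
    if (!head || !head->next) return head;
    Node* parts[2] = {nullptr, nullptr};
    int which = 0;
    while (head) {                  // deal nodes alternately onto two sublists
        Node* next = head->next;
        head->next = parts[which];
        parts[which] = head;
        which ^= 1;
        head = next;
    }
    return mergeLists(mergeSortList(parts[0]), mergeSortList(parts[1]));
}
```

No scratch array is needed here: splitting and merging only rewrite next pointers, whereas the array version in the question allocates a scratch buffer of the same size as the input.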