Iterative mergesort in C++

I'm currently working on an iterative version of mergesort, but I've encountered a problem. The program crashes for specific array sizes such as 34, 35, 36 or 100 (just a few examples) and works for the rest (e.g. it works for powers of 2). I've run some tests and debugged it, and the problem seems to be with the ranges of my iterations, i.e. the left and right halves passed to merge, but I can't find it. I'd be thankful for your help.
Code:
int * loadArray(int size){ //i input size of array and it assigns random numbers
int *array = new int[size];
srand( time( NULL ) );
for( int i = 0; i < size; i++ )
array[i]=rand()%100;
return array;
}
void merge(int arr[], int left, int middle, int right)
{
int i, j, k; //iterators
int n1 = middle-left + 1; //indexes
int n2 = right-middle; //indexes
int Left[n1], Right[n2]; //arrays holding halves
for (i = 0; i < n1; i++)
Left[i] = arr[left + i];//assigning values to left half
for (j = 0; j < n2; j++)
Right[j] = arr[middle + 1+ j];//assigning values to right half
i = 0;
j = 0;
k = left;
while (i < n1 && j < n2) //comparing and merging
{
if (Left[i] <= Right[j])
{
arr[k] = Left[i];
i++;
}
else
{
arr[k] = Right[j];
j++;
}
k++;
}
while (i < n1) //leftovers
{
arr[k] = Left[i];
i++;
k++;
}
while (j < n2) //leftovers
{
arr[k] = Right[j];
j++;
k++;
}
}
void mergeSortIter(int array[], int size) //the function which is being called and handles division of the array
{
double startTime = clock(); //start measuring time
int i;
int left_start;
for (i=1; i<=size-1; i = 2*i)
{
for (left_start=0; left_start<size-1; left_start += 2*i)
{
int mid = left_start + i - 1;
int right_end = min(left_start + 2*i - 1, size-1);
merge(array, left_start, mid, right_end);
}
//showArray(array,size);
}
cout << double( clock() - startTime ) / (double)CLOCKS_PER_SEC<< " seconds." << endl; //output the time measured
}
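One likely cause, judging from the loop bounds above: when the last run is shorter than i, mid = left_start + i - 1 lands past size - 1 while right_end is clamped to size - 1, so n2 = right - middle becomes negative and merge reads past the end of the array. For size 34 this happens at i = 4, left_start = 32 (mid = 35, right_end = 33). Powers of two never produce a short last run, which matches the sizes that work. Below is a minimal sketch of the same bottom-up loop with mid clamped as well. The std::vector interface and the name mergeRuns are assumptions of this sketch, not the original signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Merge the sorted runs arr[left..middle] and arr[middle+1..right] (inclusive bounds).
void mergeRuns(std::vector<int>& arr, int left, int middle, int right)
{
    assert(left <= middle && middle <= right);
    std::vector<int> leftRun(arr.begin() + left, arr.begin() + middle + 1);
    std::vector<int> rightRun(arr.begin() + middle + 1, arr.begin() + right + 1);
    std::size_t i = 0, j = 0;
    int k = left;
    while (i < leftRun.size() && j < rightRun.size())
        arr[k++] = (leftRun[i] <= rightRun[j]) ? leftRun[i++] : rightRun[j++];
    while (i < leftRun.size()) arr[k++] = leftRun[i++];   // leftovers from the left run
    while (j < rightRun.size()) arr[k++] = rightRun[j++]; // leftovers from the right run
}

// Bottom-up merge sort: merge runs of width 1, 2, 4, ... until one run covers the array.
void mergeSortIter(std::vector<int>& arr)
{
    const int size = static_cast<int>(arr.size());
    for (int width = 1; width < size; width *= 2)
    {
        for (int left_start = 0; left_start < size - 1; left_start += 2 * width)
        {
            // Clamp BOTH bounds: the last run may be shorter than width.
            int mid = std::min(left_start + width - 1, size - 1);
            int right_end = std::min(left_start + 2 * width - 1, size - 1);
            mergeRuns(arr, left_start, mid, right_end);
        }
    }
}
```

When mid equals right_end the right run is empty and the merge simply copies the left run back, so the short tail is left untouched until a wider pass picks it up.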

Related

Can not understand what's going wrong in my merge sort algo

Here is my code:
#include <iostream>
using namespace std;
int arr[] = { 6, 1, 9, 6, 4, 7, 3 };
int n = 7;
void merge(int l, int mid, int r) {
int n1 = mid - l + l;
int n2 = r - mid;
int arr1[n1], arr2[n2];
for (int i = 0; i < n1; i++) {
arr1[i] = arr[l + 1];
}
for (int i = 0; i < n1; i++) {
arr1[i] = arr[mid + 1 + i];
}
int i = 0, j = 0, k = l;
while (i < n1 && j < n2) {
if (arr1[i] <= arr2[j]) {
arr[k] = arr1[i];
i++, k++;
} else {
arr[k] = arr2[j];
j++;
k++;
}
}
while (i < n1) {
arr[k] = arr1[i];
i++;
k++;
}
while (j < n2) {
arr[k] = arr2[j];
j++;
k++;
}
}
void mergeSort(int l, int r) {
while (l < r) {
int mid = (l + r - 1) / 2;
mergeSort(l, mid);
mergeSort(mid + 1, r);
merge(l, mid, r);
}
}
void printArray(int arr[], int length) {
for (int i = 0; i < length; i++)
cout << arr[i] << " ";
cout << endl;
}
int main() {
printArray(arr, n);
mergeSort(0, n - 1);
printArray(arr, n);
}
I can't figure out what is wrong with the code. When I tried to debug it, it kept calling the mergeSort function again and again with the same values.
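There are several bugs hiding in plain sight. In your merge function:
int n1 = mid - l + l; has an l where there should be a 1. Naming a variable l is risky because, depending on the font, l looks confusingly close to 1.
arr1[i] = arr[l + 1]; should be arr1[i] = arr[l + i];
for (int i = 0; i < n1; i++) should be for (int i = 0; i < n2; i++) for the second loop.
arr1[i] = arr[mid + 1 + i]; should be arr2[i] = arr[mid + 1 + i];
In mergeSort, while (l < r) should be if (l < r): l and r never change inside the loop body, so the loop never ends and keeps calling mergeSort with the same values, which is what you saw in the debugger.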
Also note that it would be more consistent with C++ idioms to pass the index of the element one past the end of the slice, rather than the index of the last element as many algorithm books do. This also allows for unsigned index types such as size_t and removes the need for tricky +1 / -1 adjustments.
Here is a modified version:
#include <iostream>
#include <vector>
using namespace std;
// Merge the sorted slices arr[lo, mid) and arr[mid, hi).
void merge(int arr[], size_t lo, size_t mid, size_t hi) {
    size_t n1 = mid - lo;
    size_t n2 = hi - mid;
    vector<int> arr1(arr + lo, arr + mid);  // copy of the left slice
    vector<int> arr2(arr + mid, arr + hi);  // copy of the right slice
    size_t i = 0, j = 0, k = lo;
    while (i < n1 && j < n2) {
        if (arr1[i] <= arr2[j]) {
            arr[k++] = arr1[i++];
        } else {
            arr[k++] = arr2[j++];
        }
    }
    while (i < n1) {
        arr[k++] = arr1[i++];
    }
    while (j < n2) {
        arr[k++] = arr2[j++];
    }
}
// Sort the slice arr[lo, hi).
void mergeSort(int arr[], size_t lo, size_t hi) {
    if (hi - lo > 1) {  // if, not while: lo and hi never change in the body
        size_t mid = lo + (hi - lo) / 2;
        mergeSort(arr, lo, mid);
        mergeSort(arr, mid, hi);
        merge(arr, lo, mid, hi);
    }
}
void printArray(const int arr[], size_t length) {
    for (size_t i = 0; i < length; i++)
        cout << arr[i] << " ";
    cout << endl;
}
int main() {
    int arr[] = { 6, 1, 9, 6, 4, 7, 3 };
    size_t n = sizeof(arr) / sizeof(*arr);
    printArray(arr, n);
    mergeSort(arr, 0, n);  // the array is passed explicitly instead of being a global
    printArray(arr, n);
    return 0;
}
merge does not actually need to save the elements of the right half as they are never overwritten before they are read. Here is a simplified version:
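// Needs #include <vector>.
void merge(int arr[], size_t lo, size_t mid, size_t hi) {
    size_t n1 = mid - lo;
    std::vector<int> arr1(arr + lo, arr + mid);  // only the left slice is copied
    size_t i = 0, j = mid, k = lo;
    while (i < n1) {
        // Take from the left copy when the right slice is exhausted
        // or when the left value is not larger.
        if (j >= hi || arr1[i] <= arr[j]) {
            arr[k++] = arr1[i++];
        } else {
            arr[k++] = arr[j++];
        }
    }
    // Any remaining right-slice elements are already in their final positions.
}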

Why is my merge sort slower than this merge sort?

I've implemented merge sort in C/C++, but my code takes much longer than the code I pulled from a website.
The recursive part is exactly the same in both cases:
void mergeSort(int* arr, int l, int h) {
if (l < h) {
int mid = (l + h) / 2;
mergeSort(arr,l,mid);
mergeSort(arr, mid + 1, h);
merge(arr, l, mid, h);
}
}
However the merge algorithm is a bit different, but I don't see any significant difference here.
My merge algorithm :
void merge(int *arr, int l, int mid, int h) {
int i = l, j = mid+1, k = l;
int* newSorted = new int[h+1]();
while (i <= mid && j <= h) {
if (arr[i] < arr[j])
newSorted[k++] = arr[i++];
else
newSorted[k++] = arr[j++];
}
for (; i <= mid; i++)
newSorted[k++] = arr[i];
for (; j <= h; j++)
newSorted[k++] = arr[j];
k = 0;
for (int x = l; x <= h; x++)
arr[x] = newSorted[x];
delete[] newSorted;
}
Time taken for 200000 (two hundred thousand inputs) :
17 Seconds
Merge Algorithm from a website :
void merge(int arr[], int p, int q, int r) {
int n1 = q - p + 1;
int n2 = r - q;
int* L = new int[n1];
int *M = new int[n2];
for (int i = 0; i < n1; i++)
L[i] = arr[p + i];
for (int j = 0; j < n2; j++)
M[j] = arr[q + 1 + j];
int i, j, k;
i = 0;
j = 0;
k = p;
while (i < n1 && j < n2) {
if (L[i] <= M[j]) {
arr[k] = L[i];
i++;
}
else {
arr[k] = M[j];
j++;
}
k++;
}
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
while (j < n2) {
arr[k] = M[j];
j++;
k++;
}
delete[] L;
delete[] M;
}
Time taken for 200000 (two hundred thousand inputs) :
0 Seconds
There is a massive difference in time. I don't understand the problem in my code. I would really appreciate if someone can help me figure this out. Thank you.
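Your algorithm allocates (and, because of the trailing (), zero-initializes) h+1 elements on every call to merge, no matter how small the range being merged is.
The algorithm from the website allocates only r-p+1 elements in total, the size of the range being merged.
(Your h is its r, your l is its p.)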
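A minimal sketch of the asker's merge with the temporary buffer sized to the merged range, h - l + 1 elements, rather than h + 1. Using std::vector for the buffer is an assumption of this sketch. Allocating h + 1 elements on each of the roughly n merges costs on the order of n² element initializations overall, which would explain the gap in the timings.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Merge the sorted ranges arr[l..mid] and arr[mid+1..h] (inclusive bounds).
void merge(int* arr, int l, int mid, int h) {
    assert(l <= mid && mid <= h);
    std::vector<int> newSorted;
    newSorted.reserve(h - l + 1);          // only as large as the merged range
    int i = l, j = mid + 1;
    while (i <= mid && j <= h) {
        if (arr[i] <= arr[j])              // <= keeps equal elements in order (stable)
            newSorted.push_back(arr[i++]);
        else
            newSorted.push_back(arr[j++]);
    }
    while (i <= mid) newSorted.push_back(arr[i++]);
    while (j <= h)   newSorted.push_back(arr[j++]);
    std::copy(newSorted.begin(), newSorted.end(), arr + l);  // write back at offset l
}

void mergeSort(int* arr, int l, int h) {
    if (l < h) {
        int mid = l + (h - l) / 2;         // avoids overflow of l + h
        mergeSort(arr, l, mid);
        mergeSort(arr, mid + 1, h);
        merge(arr, l, mid, h);
    }
}
```

The buffer index now starts at 0 instead of l, so the copy back has to add the offset l.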

Hybrid Mergesort and Insertion sort

I am trying to implement a hybrid of mergesort and insertion sort. When the subarray size drops below a threshold, it should switch to insertion sort.
However, I tried it with a bunch of arrays of different lengths and with different thresholds, and most of the time there isn't any noticeable difference other than 2-3 fewer comparisons. I was told that switching to insertion sort for small subarrays would help greatly.
Am I doing it wrong?
#include <iostream>
int comparisons = 0;
int swaps = 0;
void mergesort(int x[], int l, int r);
void insertionSort(int x[],int start, int end);
int main() {
int x[] = {9,5,1,4,3,10,29,69,5,9,11,19,21,69,0,2,3,4,5,11,111,96,25,32,21,2,12,3,52,55,23,32,15,15,14,13,9,5,1,4,3,10,29,69,5,9,11,19,21,69,0,2,3,4,5,11,111,96,25,32,21,2,12,3,52,55,23,32,15,15,14,13,};
// insertionSort(x,10);
int sizeX= sizeof(x)/sizeof(x[0]) ;
mergesort(x, 0, sizeX-1);
for(int i =0;i<sizeX;i++){
std::cout << x[i] << " ";
}
// std::cout << "\nSWAPS: " << swaps;
std::cout << "\nCOMPARISONS: " << comparisons;
}
void insertionSort(int arr[], int start,int end)
{
int i, key, j;
for (i = start +1 ; i < end; i++)
{
key = arr[i];
j = i - 1;
/* Move elements of arr[0..i-1], that are
greater than key, to one position ahead
of their current position */
while (j >= 0 && arr[j] > key)
{
comparisons++;
arr[j + 1] = arr[j];
j = j - 1;
}
arr[j + 1] = key;
}
}
void insertionSort2(int x[],int start, int end){
for(int i =start; i < end;i++){
for (int j= i; j!= 0;j--){
comparisons++;
if(x[j] < x[j-1]){
int temp = x[j-1];
x[j-1] = x[j];
x[j] = temp;
swaps++;
}
else{
break;
}
}
}
}
void mergesort(int x[], int l, int r) {
if (l >= r)
return;
int mid = (l + r) / 2;
if(r - l < 3){
insertionSort(x, l,r+1);
}else{
mergesort(x, l, mid);
mergesort(x, mid + 1, r);
int i = l;
int j = mid + 1;
int k = 0;
int tmp[r - l + 1];
while (i <= mid && j <= r) {
comparisons++;
if (x[i] >= x[j]) {
tmp[k] = x[j];
j++;
} else {
tmp[k] = x[i];
i++;
}
swaps++;
k++;
}
while (i <= mid) {
tmp[k] = x[i];
i++;
k++;
}
while (j <= r) {
tmp[k] = x[j];
j++;
k++;
}
for (i = 0; i < k; i++) x[l + i] = tmp[i];
}
}
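Two details in the code above may be hiding the effect, offered as a guess rather than a diagnosis. First, insertionSort tests j >= 0 rather than j >= start, so it can scan below the subarray it was asked to sort. Second, comparisons++ sits inside the while body, so a comparison that fails is never counted. Note also that the usual benefit of the cutoff is lower overhead (fewer recursive calls and temporary buffers), which shows up in running time on large inputs rather than in the comparison count. With a cutoff of 3 elements and 72 inputs there is little to measure. Below is a minimal sketch with a bounded insertion sort and a configurable cutoff. The names kCutoff and hybridMergesort and the value 16 are assumptions to tune, not values from the question.

```cpp
#include <cassert>
#include <vector>

const int kCutoff = 16;  // hypothetical threshold; tune by timing on real data

// Sort x[start..end] (inclusive) by insertion, never reading below start.
void insertionSort(int x[], int start, int end) {
    for (int i = start + 1; i <= end; i++) {
        int key = x[i];
        int j = i - 1;
        while (j >= start && x[j] > key) {
            x[j + 1] = x[j];
            j--;
        }
        x[j + 1] = key;
    }
}

// Merge sort on x[l..r] (inclusive) that hands small ranges to insertion sort.
void hybridMergesort(int x[], int l, int r) {
    if (l >= r) return;
    if (r - l + 1 <= kCutoff) {
        insertionSort(x, l, r);
        return;
    }
    int mid = l + (r - l) / 2;
    hybridMergesort(x, l, mid);
    hybridMergesort(x, mid + 1, r);
    std::vector<int> tmp;
    tmp.reserve(r - l + 1);
    int i = l, j = mid + 1;
    while (i <= mid && j <= r)
        tmp.push_back(x[j] < x[i] ? x[j++] : x[i++]);  // take left on ties (stable)
    while (i <= mid) tmp.push_back(x[i++]);
    while (j <= r) tmp.push_back(x[j++]);
    assert(static_cast<int>(tmp.size()) == r - l + 1);  // every element was merged
    for (int k = 0; k < static_cast<int>(tmp.size()); k++) x[l + k] = tmp[k];
}
```

To see whether the cutoff helps, time both variants on arrays of at least a few hundred thousand elements and try several values of kCutoff.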

Segmentation fault:11 MergeSort

I am trying to implement the merge sort algorithm and I get a segmentation fault. Why? The error seems to be in the MergeSort function. On its second call, when it should be looking at an array of only 4 numbers (the length should be 4), it reports a length of 27. Why? (Tested on an array with 8 elements.)
#include<iostream>
using namespace std;
int n, A[1000];
void citire(int lungime) {
for (int i = 0; i < lungime; i++) cin >> A[i];
}
void afisare(int lungime) {
for (int i = 0; i < lungime; i++)
cout << A[i] << " ";
cout << '\n';
}
int lungime(int A[]) {
int i = 0;
while (A[i]) i++;
return i;
}
void Merge(int L[], int R[], int A[]) {
int nL = lungime(L);
int nR = lungime(R);
int i = 0, j = 0, k = 0;
while (i < nL && j < nR) {
if (L[i] <= R[j]) {
A[k] = L[i];
i++;
}
else {
A[k] = R[j];
j++;
}
k++;
}
while (i < nL) {
A[k] = L[i];
i++;
k++;
}
while (j < nR) {
A[k] = R[j];
j++;
k++;
}
}
void MergeSort(int A[]) {
int n1 = lungime(A);
if (n1 < 2) return;
else
{
int mid = (int)n1 / 2;
int L[mid];
int R[n - mid];
for (int i = 0; i < mid; i++)
L[i] = A[i];
for (int i = mid; i < n; i++)
R[i - mid] = A[i];
MergeSort(L);
MergeSort(R);
Merge(L, R, A);
}
}
int main() {
cin >> n;
citire(n);
MergeSort(A);
afisare(n);
return 0;
}
Changes made in this example: A[], L[] and R[] are allocated using new. A[] is passed as a parameter. L[] and R[] are allocated in Merge(). Sizes and/or indices are passed as parameters, and lungime() is no longer used to get the size. Other changes are noted in comments.
#include<iostream>
using namespace std;
void citire(int A[], int lungime) { // A is parameter
for (int i = 0; i < lungime; i++) cin >> A[i];
}
void afisare(int A[], int lungime) { // A is parameter
for (int i = 0; i < lungime; i++)
cout << A[i] << " ";
cout << '\n';
}
// A, low, mid, end are parameters
// L and R allocated here
void Merge(int A[], int low, int mid, int end) {
int sizeL = mid-low;
int sizeR = end-mid;
int *L = new int[sizeL];
int *R = new int[sizeR];
for(int i = 0; i < sizeL; i++)
L[i] = A[low+i]; // A[low+i]
for(int i = 0; i < sizeR; i++)
R[i] = A[mid+i]; // A[mid+i]
int i = 0, j = 0, k = low; // k = low
while (i < sizeL && j < sizeR) {
if (L[i] <= R[j]) {
A[k] = L[i];
i++;
}
else {
A[k] = R[j];
j++;
}
k++;
}
while (i < sizeL) {
A[k] = L[i];
i++;
k++;
}
while (j < sizeR) {
A[k] = R[j];
j++;
k++;
}
delete[] R;
delete[] L;
}
// A, low, end are parameters
void MergeSort(int A[], int low, int end) {
int sizeA = end - low;
if(sizeA < 2)
return;
int mid = low + (sizeA / 2); // mid = low + ...
MergeSort(A, low, mid);
MergeSort(A, mid, end);
Merge(A, low, mid, end);
}
int main() {
int n;
cin >> n;
int *A = new int[n]; // A is allocated
citire(A, n); // A, n are parameters
MergeSort(A, 0, n); // A, 0, n are parameters
afisare(A, n); // A, n are parameters
delete[] A;
return 0;
}
"The lungime function is the length of the string and this function works good. I've tested it on different arrays".
Well, that is purely accidental: uninitialized memory can happen to contain a zero, which then acts as the array terminator.
If you want to keep the current design, you should:
initialize A to zeros,
make sure that there are no more than 999 elements in the input stream,
make sure that no element has the value zero, as zero is reserved and used as the terminator, and
define L and R (in MergeSort) one element longer, and initialize the last element to zero.
Unless there are overwhelming reasons for a "roll your own" sort solution, you might have a look at prefab sort support. The C++ standard library offers just that: std::sort from <algorithm> works on std::vector as well as on plain arrays.
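For reference, a minimal sketch of the ready-made route using std::sort on a std::vector. The helper name sortedCopy is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Return a sorted copy of the input (sortedCopy is a hypothetical helper name).
std::vector<int> sortedCopy(std::vector<int> values) {
    std::sort(values.begin(), values.end());               // ascending order
    assert(std::is_sorted(values.begin(), values.end()));  // debug-only sanity check
    return values;
}
```

A vector carries its own size, so no zero terminator is needed and zero stays a valid input value. std::stable_sort has the same interface and, like merge sort, preserves the order of equal elements.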

Merge-Sort Algorithm. Merge function does not work

I am trying to do a Merge-Sort algorithm and I ran into a block with the merge function. I have tried several different ways of trying to fix this and followed several YouTube tutorials, but it still does not work.
Could somebody please help me figure out what's wrong with this?
mergeSort
void mergeSort(int arrayToSort[], int startIndex, int lengthToSort) {
int midIndex = 0;
if (startIndex < lengthToSort) { // if base case not reached
int midIndex = (startIndex + lengthToSort) / 2;
mergeSort(arrayToSort, startIndex, midIndex);
mergeSort(arrayToSort, (midIndex + 1), lengthToSort);
merge(arrayToSort, startIndex, lengthToSort);
}
}
Merge
void merge(int arraySortedInTwoHalves[], int startIndex, int length) {
int size = (length - startIndex) + 1;
int padding = 0;
if (size % 2 > 0) (padding = 1);
int *temp = new int[size];
int left = size / 2;
int right = (size / 2) + padding;
int i = 0;
int j = (size / 2) + padding;
int k = 0;
while ((i < left) && (j < length)) {
if (arraySortedInTwoHalves[i] <= arraySortedInTwoHalves[j]) {
temp[k] = arraySortedInTwoHalves[i];
i++;
}
else {
temp[k] = arraySortedInTwoHalves[j];
j++;
}
k++;
while (i < left) {
temp[k] = arraySortedInTwoHalves[i];
k++;
}
while (j < length) {
temp[k] = arraySortedInTwoHalves[j];
j++;
}
}
}
Main
int main() {
// setup an array of random numbers of size n
const int arrSize = 50;
int nums[arrSize];
for (int i = 0; i <= arrSize; i++) {
nums[i] = rand() % arrSize;
}
mergeSort(nums, 0, arrSize-1);
for (int i = 0; i < arrSize; i++) {
std::cout << nums[i] << " ";
}
return(0);
}
Solution
Here's a complete solution I made, just in case it is helpful to anybody else...
#include <iostream>
#include <cstdlib> // rand, system
#include <ctime> // clock
void mergeSort(int arrayToSort[], int startIndex, int lengthToSort);
void merge(int arraySortedInTwoHalves[], int startIndex, int length);
int main() {
// setup an array of random numbers of size n
const int arrSize = 10000;
int nums[arrSize];
for (int i = 0; i < arrSize; i++) { // i <= arrSize would write one past the end of nums
nums[i] = rand() % arrSize;
}
// just a timer to measure performance
int start_s = clock();
mergeSort(nums, 0, arrSize-1);
for (int i = 0; i < arrSize; i++) {
std::cout << nums[i] << " ";
}
// stop timer
int stop_s = clock();
std::cout << std::endl << std::endl << "Executed In: " << (stop_s - start_s) / double(CLOCKS_PER_SEC) << "s\n" << std::endl;
system("pause");
return(0);
}
void mergeSort(int arrayToSort[], int startIndex, int lengthToSort) {
int midIndex = 0;
if (startIndex < lengthToSort) { // if base case not reached
midIndex = (startIndex + lengthToSort) / 2;
mergeSort(arrayToSort, startIndex, midIndex);
mergeSort(arrayToSort, (midIndex + 1), lengthToSort);
merge(arrayToSort, startIndex, lengthToSort);
}
}
void merge(int arraySortedInTwoHalves[], int startIndex, int length) {
int size = (length - startIndex) + 1;
int *temp = new int[size]; // temp array to hold elements
int left = startIndex; // left side of the array
int midIndex = (startIndex + length) / 2; // border
int right = midIndex + 1; // right side of the array
int i = 0;
while ((left <= midIndex) && (right <= length)) { // while there are elements in both sides...
if (arraySortedInTwoHalves[left] < arraySortedInTwoHalves[right]) { // add whichever is lower from the appropriate side
temp[i++] = arraySortedInTwoHalves[left++];
}
else {
temp[i++] = arraySortedInTwoHalves[right++];
}
}
while (left <= midIndex) // if one runs out...
{
temp[i++] = arraySortedInTwoHalves[left++];
}
while (right <= length) // if one runs out...
{
temp[i++] = arraySortedInTwoHalves[right++];
}
for (i = 0; i < size; i++) { // copy elements to the original array
arraySortedInTwoHalves[startIndex + i] = temp[i]; // startIndex + i because of recursion
}
delete []temp; // delete temp array
}
Looks like your bug is here
}
k++;
while (i < left) {
temp[k] = arraySortedInTwoHalves[i];
k++;
}
while (j < length) {
temp[k] = arraySortedInTwoHalves[j];
j++;
}
}
This should be
}
k++;
} // Moved brace from end to here.
// These two loops should be after the main loop.
while (i < left) {
temp[k] = arraySortedInTwoHalves[i];
i++; // You forgot to increment i
k++;
}
while (j < length) {
temp[k] = arraySortedInTwoHalves[j];
j++;
k++; // You forgot to increment k
}
It's hard to tell whether there are other errors, but I suspect there are more here.
Consider the name of the third parameter, lengthToSort:
void mergeSort(int arrayToSort[], int startIndex, int lengthToSort) {
If it really is a length, then arrayToSort + lengthToSort points one past the end, which suggests you are using the standard C++ idiom of half-open ranges, [begin, end).
This next call seems to follow that convention:
mergeSort(arrayToSort, startIndex, midIndex);
But this call suggests that the upper bound is inclusive. It skips midIndex, which only works if midIndex was already covered by the call above, and in that case lengthToSort must be the index of the last element rather than one past it, which is not what the name implies:
mergeSort(arrayToSort, (midIndex + 1), lengthToSort);
Are your ranges consistent? Many implementations make the convention explicit in the interface:
mergeSort(begin, mid); // begin, mid and end are iterators.
mergeSort(mid, end);
merge(begin, mid, end);
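A minimal sketch of that interface, assuming random-access iterators and leaning on std::inplace_merge for the merge step. That standard-library call is swapped in here for brevity and is not the merge from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Sort the half-open range [begin, end): begin is included, end is one past the last element.
template <typename Iter>
void mergeSort(Iter begin, Iter end) {
    auto size = std::distance(begin, end);
    if (size < 2) return;                 // 0 or 1 elements: already sorted
    Iter mid = begin + size / 2;          // first element of the right half
    mergeSort(begin, mid);                // sorts [begin, mid)
    mergeSort(mid, end);                  // sorts [mid, end); no +1 / -1 needed
    std::inplace_merge(begin, mid, end);  // merges the two sorted halves
    assert(std::is_sorted(begin, end));   // debug-only sanity check
}
```

It can be called as mergeSort(nums, nums + arrSize) for a plain array or as mergeSort(v.begin(), v.end()) for a std::vector. With half-open ranges mid belongs to exactly one half, so the question of whether the upper bound is inclusive does not arise.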