Counting inversions in C++ via mergesort

I have a text file that contains the 100,000 integers from 1 to 100,000 in unsorted order (no duplicates).
The name of the file is "IntegerArray.txt".
My task is to count the number of inversions in the text file. An inversion is a pair of elements a, b where a comes before b, but a > b. Thus "1 2 3 6 4 5" contains two inversions (6 > 4 and 6 > 5).
I implemented it with merge sort; the sorting works, but the inversion-counting part keeps giving wrong answers, and I cannot find out why.
The following is my code:
long long mergeAndCount(vector<int>& vec, int p, int q, int r) {
long long count = 0;
vector<int> L, R;
for (int i = p; i <= q; ++i) L.push_back(vec[i]);
for (int j = q + 1; j <= r; ++j) R.push_back(vec[j]);
L.push_back(numeric_limits<int>::max()); //sentinel element
R.push_back(numeric_limits<int>::max());
int i = 0, j = 0;
for (int k = p; k <= r; ++k) {
if (L[i] <= R[j]) {
vec[k] = L[i];
++i;
} else {
vec[k] = R[j];
++j;
if (L[i] != L.back() && R[j] != R.back())
// Problem SOLVED: change this line to count += q - p + 1 - i
count += q - i + 1;
}
}
return count;
}
long long inversion(vector<int>& vec, int p, int r) {
long long count = 0;
if (p < r) {
int q = (p + r) / 2;
count = inversion(vec, p, q);
count += inversion(vec, q + 1, r);
count += mergeAndCount(vec, p, q, r);
}
return count;
}
int main() {
ifstream infile("IntegerArray.txt");
int a;
vector<int> vec;
while (infile >> a)
vec.push_back(a);
cout << inversion(vec, 0, vec.size()-1);
return 0;
}
The result from the above code is 32620796130, which is incorrect.
The answer by brute force with the following code is 2407905288, which is correct.
long long inversion(vector<int>& vec, int p, int r) {
long long count = 0;
for (int i = 0; i < vec.size(); ++i)
for (int j = i + 1; j < vec.size(); ++j)
if (vec[i] > vec[j])
++count;
return count;
}
Can someone help me solve this?
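For reference, here is a minimal sketch of the merge step with the count fixed, under the same conventions as the question (inclusive bounds p, q, r). It is not the asker's exact code: it drops the sentinels and counts before advancing j, so no guard condition is needed. The quantity L.size() - i is the same as the q - p + 1 - i mentioned in the comment above. Note that in the original, the guard is evaluated after ++j, so it appears to skip the count whenever the last element of the right half is taken (for the input "2 1" it yields 0 instead of 1).

```cpp
#include <cstddef>
#include <vector>

// Merge the sorted halves vec[p..q] and vec[q+1..r] and return the number of
// cross inversions, i.e. pairs whose larger element sits in the left half.
long long mergeAndCount(std::vector<int>& vec, int p, int q, int r) {
    std::vector<int> L(vec.begin() + p, vec.begin() + q + 1);
    std::vector<int> R(vec.begin() + q + 1, vec.begin() + r + 1);
    long long count = 0;
    std::size_t i = 0, j = 0;
    int k = p;
    while (i < L.size() && j < R.size()) {
        if (L[i] <= R[j]) {
            vec[k++] = L[i++];
        } else {
            // R[j] is smaller than every element still left in L,
            // so it forms one inversion with each of them.
            count += static_cast<long long>(L.size() - i);
            vec[k++] = R[j++];
        }
    }
    // Copy whatever remains; these elements add no further inversions.
    while (i < L.size()) vec[k++] = L[i++];
    while (j < R.size()) vec[k++] = R[j++];
    return count;
}

// Sort vec[p..r] and return the number of inversions it contained.
long long inversion(std::vector<int>& vec, int p, int r) {
    if (p >= r) return 0;
    int q = p + (r - p) / 2;
    long long count = inversion(vec, p, q);
    count += inversion(vec, q + 1, r);
    count += mergeAndCount(vec, p, q, r);
    return count;
}
```

It is called the same way as in the question, inversion(vec, 0, vec.size() - 1), provided the vector is not empty.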

Related

Why is my merge sort slower than this merge sort?

I've implemented merge sort in C/C++, but my code takes much longer than the code I pulled from a website.
The recursive code seems to be exactly the same in both cases:
void mergeSort(int* arr, int l, int h) {
if (l < h) {
int mid = (l + h) / 2;
mergeSort(arr,l,mid);
mergeSort(arr, mid + 1, h);
merge(arr, l, mid, h);
}
}
The merge algorithm is a bit different, but I don't see any significant difference there.
My merge algorithm:
void merge(int *arr, int l, int mid, int h) {
int i = l, j = mid+1, k = l;
int* newSorted = new int[h+1]();
while (i <= mid && j <= h) {
if (arr[i] < arr[j])
newSorted[k++] = arr[i++];
else
newSorted[k++] = arr[j++];
}
for (; i <= mid; i++)
newSorted[k++] = arr[i];
for (; j <= h; j++)
newSorted[k++] = arr[j];
k = 0;
for (int x = l; x <= h; x++)
arr[x] = newSorted[x];
delete[] newSorted;
}
Time taken for 200,000 inputs: 17 seconds
Merge algorithm from a website:
void merge(int arr[], int p, int q, int r) {
int n1 = q - p + 1;
int n2 = r - q;
int* L = new int[n1];
int *M = new int[n2];
for (int i = 0; i < n1; i++)
L[i] = arr[p + i];
for (int j = 0; j < n2; j++)
M[j] = arr[q + 1 + j];
int i, j, k;
i = 0;
j = 0;
k = p;
while (i < n1 && j < n2) {
if (L[i] <= M[j]) {
arr[k] = L[i];
i++;
}
else {
arr[k] = M[j];
j++;
}
k++;
}
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
while (j < n2) {
arr[k] = M[j];
j++;
k++;
}
delete[] L;
delete[] M;
}
Time taken for 200,000 inputs: 0 seconds
There is a massive difference in time. I don't understand what the problem in my code is. I would really appreciate it if someone could help me figure this out. Thank you.
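Your merge allocates (and, because of the trailing (), zero-initializes) an array of h+1 ints on every call, no matter how small the range l..h is. With roughly n merge calls, each touching up to n ints, the total work grows quadratically with the input size.
The algorithm from the website only allocates r-p+1 ints per call (n1 + n2), which is exactly the size of the range being merged.
(Your h is its r, and your l is its p.)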
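As a rough sketch of that fix, the temporary buffer can be sized to the range being merged and indexed from 0. The use of std::vector here is my own choice for brevity, not something from either version above; a new int[h - l + 1] with a matching delete[] would work the same way.

```cpp
#include <vector>

// Merge the sorted ranges arr[l..mid] and arr[mid+1..h] (inclusive bounds).
void merge(int* arr, int l, int mid, int h) {
    // Buffer holds only the h - l + 1 elements being merged.
    std::vector<int> tmp(h - l + 1);
    int i = l, j = mid + 1, k = 0;
    while (i <= mid && j <= h)
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= mid) tmp[k++] = arr[i++];
    while (j <= h) tmp[k++] = arr[j++];
    // tmp[x] corresponds to arr[l + x].
    for (int x = 0; x < k; x++)
        arr[l + x] = tmp[x];
}

void mergeSort(int* arr, int l, int h) {
    if (l < h) {
        int mid = l + (h - l) / 2;
        mergeSort(arr, l, mid);
        mergeSort(arr, mid + 1, h);
        merge(arr, l, mid, h);
    }
}
```

With this change each level of the recursion allocates about n ints in total, so the allocation cost stays in line with the O(n log n) cost of the sort itself.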

Errors Within Void Merge

I've been having problems trying to figure out how to fix this code I wrote for Mergesort.
The intended result was to output a sorted array of inputs, but the void merge function contains errors that result in either an unsorted array or an array of really large or small numbers.
I've tried many times to fix them, but the result still doesn't come out perfectly.
Can you look it over and tell me what I've been doing wrong?
#include "pch.h"
#include <iostream>
using namespace std;
void merge(int* arr, int p, int q, int r) {
// copy A[p..q] into L
// and A[q+1..r] into R
int i, j, k;
int n1 = q - p + 1;
int n2 = r - q;
int* L = new int[n1+1];
int* R = new int[n2+1];
for (i = 1; i <= n1; i++) {
L[i] = arr[p+i-1];
}
for (j = 1; j <= n2; j++){
R[j] = arr[q+j];
}
L[n1+1] = 99999;
R[n2+1] = 99999; //represents infinity
i = j = 1;
for (k = p; k <= r; k++)
{
if (L[i] <= R[j]) {
arr[k] = L[i];
i = i + 1;
}
else {
arr[k] = R[j];
j = j + 1;
}
return;
}
}
void mergesort(int* arr, int p, int r) {
if (p < r) {
int q = floor((p + r) / 2);
mergesort(arr, p, q);
mergesort(arr, q + 1, r);
merge(arr, p, q, r);
}
return;
}
int main() {
int r;
cin >> r;
int* arr = new int[r];
for (int i = 0; i < r; i++) {
int num;
cin >> num;
arr[i] = num;
}
int p = 0;
// sorting function
mergesort(arr,p,r);
for (int i = 0; i < r; i++) {
cout << arr[i] << ";";
}
return 0;
}
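A few things in the code above look like they would produce exactly these symptoms. L and R are allocated with n1+1 and n2+1 elements but written at indices 1 through n1+1 and n2+1, which is one past the end. The return; sits inside the for loop, so merge stops after placing a single element. mergesort is called with r (the element count) as the last index instead of r - 1. The sentinel 99999 fails for any input larger than that. Below is a minimal sketch of one way to fix these while keeping the sentinel approach; the 0-based indexing and the use of std::vector are my own choices, and the sentinel assumes no input value equals the largest int.

```cpp
#include <limits>
#include <vector>

// Merge the sorted ranges arr[p..q] and arr[q+1..r] (inclusive bounds).
void merge(int* arr, int p, int q, int r) {
    int n1 = q - p + 1;
    int n2 = r - q;
    // One extra slot per half for the sentinel; valid indices are 0..n1 and 0..n2.
    std::vector<int> L(n1 + 1), R(n2 + 1);
    for (int i = 0; i < n1; i++) L[i] = arr[p + i];
    for (int j = 0; j < n2; j++) R[j] = arr[q + 1 + j];
    // Sentinels stand in for infinity (assumption: no input equals INT_MAX).
    L[n1] = std::numeric_limits<int>::max();
    R[n2] = std::numeric_limits<int>::max();
    int i = 0, j = 0;
    for (int k = p; k <= r; k++) {
        if (L[i] <= R[j]) arr[k] = L[i++];
        else arr[k] = R[j++];
    }
    // No return inside the loop: every position p..r must be filled.
}

void mergesort(int* arr, int p, int r) {
    if (p < r) {
        int q = p + (r - p) / 2; // integer division already rounds down
        mergesort(arr, p, q);
        mergesort(arr, q + 1, r);
        merge(arr, p, q, r);
    }
}
```

With n elements read into arr, the call from main would then be mergesort(arr, 0, n - 1).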

failure in QVector<T>::operator[]: "index out of range" is thrown unexpectedly

I have two sorting methods: insertion sort and shell sort. I adapted both working functions to C++ from plain C. The problem is that the ins_sort function works fine while shell_sort fails. What could be the reason?
bool less(QVector<int> &arr, int a, int b)
{
return arr[a] < arr[b];
}
// Performs swap on elements at a and b in QVector<int> arr
void qswap(QVector<int> &arr, int a, int b)
{
int temp = arr[a];
arr[a] = arr[b];
arr[b] = temp;
}
/* Failure is thrown in this method */
void shell_sort(GraphicsView &window, SwapManager &manager)
{
auto list = window.items();
QVector<int> arr;
for (auto item : list)
arr.push_back(static_cast<QGraphicsRectWidget*>(item)->m_number);
int N = arr.size();
int h = 1;
while (h < N/3) h = 3*h + 1;
while (h >= 1)
{
for (int i = h; i < N; ++i)
{
for (int j = i; less(arr, j, j-h) && j >= h; j -= h)
{
qswap(arr, j, j-h);
manager.addPair(j, j - h);
}
}
h /= 3;
}
}
And this one works correctly.
/* This method works just fine */
void ins_sort(GraphicsView &window, SwapManager &manager)
{
auto list = window.items();
int i, j;
QVector<int> arr;
for (auto item : list)
{
arr.push_back(static_cast<QGraphicsRectWidget*>(item)->m_number);
}
int N = arr.size();
for (i = 1; i < N; ++i)
{
for (j = i - 1; j != -1 && less(arr, j + 1, j); --j)
{
qswap(arr, j, j + 1);
manager.addPair(j, j + 1);
}
}
}
The debugger points to this piece of code in "qvector.h":
Q_ASSERT_X(i >= 0 && i < d->size, "QVector<T>::operator[]", "index out of range");
return data()[i]; }
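In the for-loop condition, check the value of j before comparing items. As written, less(arr, j, j-h) is evaluated first, so it reads arr[j-h] with a negative index once j drops below h. Because && short-circuits, putting j >= h first prevents that: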
for (int j = i; j >= h && less(arr, j, j-h); j -= h)
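To see the corrected loop in isolation, here is a small sketch of the same shell sort on a plain std::vector<int>. The Qt types, the GraphicsView and the SwapManager from the question are left out, so this only illustrates the ordering of the loop condition. It also shows why ins_sort never failed: that version already tests j != -1 before calling less.

```cpp
#include <utility>
#include <vector>

// Shell sort with the 3h+1 gap sequence, as in the question.
void shellSort(std::vector<int>& arr) {
    const int N = static_cast<int>(arr.size());
    int h = 1;
    while (h < N / 3) h = 3 * h + 1;
    while (h >= 1) {
        for (int i = h; i < N; ++i) {
            // Bounds check first: arr[j - h] is only read when j >= h.
            for (int j = i; j >= h && arr[j] < arr[j - h]; j -= h)
                std::swap(arr[j], arr[j - h]);
        }
        h /= 3;
    }
}
```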

Designing MERGE-SORT Algorithm - VERY WEIRD ISSUE ! "std::bad_alloc at memory location 0x00486F78."

This is for an assignment in an algorithm class. I understand and agree that using a vector would simplify things, but that isn't an option.
The code for the Mergesort / merge algorithm can't be modified either.
I need to run the merge sort as follows:
The input size goes from 100 to 1000 in increments of 100. For each size I use 5 different arrays, and for each array I run the sort 1000 times.
That being said - everything works fine until my loop reaches 700 and crashes with the error: "Unhandled exception at 0x75612F71 in msdebug.exe: Microsoft C++ exception: std::bad_alloc at memory location 0x010672F4."
Here is my code:
int const size = 6;
int const size2 = 1001;
int const times = 6;
int const interval = 11;
void merge(int arr[], int p, int q, int r)
{
int n1 = q - p + 1;
int n2 = r - q;
int * L = new int[n1 + 1];
int * R = new int[n2 + 1]; // line giving the error after 700
for (int i = 1; i <= n1; i++)
{
L[i] = arr[p + i - 1];
}
for (int j = 1; j <= n2; j++)
{
R[j] = arr[q + j];
}
L[n1 + 1] = 32768;
R[n2 + 1] = 32768;
int i, j;
i = j = 1;
for (int k = p; k <= r; k++)
{
if (L[i] <= R[j])
{
arr[k] = L[i];
i++;
}
else
{
arr[k] = R[j];
j++;
}
}
}
void mergeSort(int arr[], int p, int r)
{
int q;
if (p < r)
{
q = ((p + r) / 2);
mergeSort(arr, p, q);
mergeSort(arr, (q + 1), r);
merge(arr, p, q, r);
}
}
void copyArray(int original[][size2], int copy[], int row, int finish)
{
int i = 1;
while (i <= finish)
{
copy[i] = original[row][i];
i++;
}
}
void copyOneD(int orig[], int cop[])
{
for (int i = 1; i < size2; i++)
{
cop[i] = orig[i];
}
}
int main()
{
struct timeval;
clock_t start, end;
srand(time(NULL));
int arr[size][size2];
int arr2[size2];
int arrCopy[size2];
double tMergeSort[times][interval];
double avgTmergeSort[11];
/*for (int i = 1; i < (size2); i++)
{
arr2[i] = rand();
}*/
for (int i = 1; i < size; i++)
{
for (int j = 1; j < size2; j++)
{
arr[i][j] = rand();
}
}
for (int x = 100; x <= 1000; x = x + 100) //This loop crashes >=700
{
for (int r = 1; r <= 5; r++)
{
copyArray(arr, arr2, r, 1001);
for (int k = 0; k < 1000; k++)
{
copyOneD(arr2, arrCopy);
mergeSort(arrCopy, 1, x);
}
}
}
return 0;
}
You can ignore the code and the arrays. Those functions work fine.
Everything works fine until I set 'x <= 700' or higher and then it crashes.
I had a theory that maybe the computer runs out of memory for the pointers in the merge algorithm but when I tried to use delete it also crashed.
Any help is appreciated and suggestions as well.
Thanks
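One likely cause, going only by the code shown: L and R are allocated with n1 + 1 and n2 + 1 elements, but the loops and sentinels write to indices 1 through n1 + 1 and n2 + 1, which is one element past the end of each buffer. Writing past a heap allocation is undefined behavior, and corrupting the heap this way can surface much later as a std::bad_alloc or as a crash in delete. The buffers are also never freed, and copyArray is called with finish = 1001, which writes copy[1001] in an array whose last valid index is 1000. If the 1-based style has to stay, a sketch of the smallest change I can see is to allocate one more slot and free the buffers at the end. The sentinel 32768 is kept from the question and assumes every value is below it, which holds for rand() only where RAND_MAX is 32767.

```cpp
// Merge the sorted ranges arr[p..q] and arr[q+1..r], keeping the
// 1-based indexing of the temporary buffers used in the question.
void merge(int arr[], int p, int q, int r) {
    int n1 = q - p + 1;
    int n2 = r - q;
    // Indices 1..n1+1 are used, so n1 + 2 slots are needed (slot 0 stays unused).
    int* L = new int[n1 + 2];
    int* R = new int[n2 + 2];
    for (int i = 1; i <= n1; i++) L[i] = arr[p + i - 1];
    for (int j = 1; j <= n2; j++) R[j] = arr[q + j];
    L[n1 + 1] = 32768; // sentinel from the question; assumes all values < 32768
    R[n2 + 1] = 32768;
    int i = 1, j = 1;
    for (int k = p; k <= r; k++) {
        if (L[i] <= R[j]) arr[k] = L[i++];
        else arr[k] = R[j++];
    }
    // Release the buffers so repeated runs do not leak memory.
    delete[] L;
    delete[] R;
}

void mergeSort(int arr[], int p, int r) {
    if (p < r) {
        int q = (p + r) / 2;
        mergeSort(arr, p, q);
        mergeSort(arr, q + 1, r);
        merge(arr, p, q, r);
    }
}
```

Passing size2 - 1 instead of 1001 as the last argument of copyArray would keep that copy inside the arrays as well.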

How to get BucketSort algorithm working?

I am trying to create a bucketsort algorithm in C++, but it is not working at all. Every run, it adds many new numbers, often very large, such as in the billions, into the array. Does anyone know why this is? Here is the code - (Note that I am passing in an array of size 100, with random numbers from 0 to ~37000, and that the insertion sort function is fully functional and tested multiple times)
It would be greatly appreciated if someone could point out what's wrong.
void bucketSort(int* n, int k)
{
int c = int(floor(k/10)), s = *n, l = *n;
for(int i = 0; i < k; i++) {
if(s > *(n + i)) s = *(n + i);
else if(l < *(n + i)) l = *(n + i);
}
int bucket[c][k + 1];
for(int i = 0; i < c; i++) {
bucket[i][k] = 0;
}
for(int i = 0; i < k; i++) {
for(int j = 0; j < c; j++) {
if(*(n + i) >= (l - s)*j/c) {
continue;
} else {
bucket[j][bucket[j][k]++] = *(n + i);
break;
}
}
}
for(int i = 0; i < c; i++) {
insertionSort(&bucket[i][0], k);
}
}
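This line does not compile as standard C++: int bucket[c][k + 1]; declares a variable-length array, which is not part of the C++ standard (some compilers accept it as an extension).
I think the problem is with your bucket indices. This part here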
for(int j = 0; j < c; j++) {
if(*(n + i) >= (l - s)*j/c) {
continue;
} else {
bucket[j][bucket[j][k]++] = *(n + i);
break;
}
}
does not do the equivalent of:
insert n[i] into bucket[ bucketIndexFor( n[i]) ]
First it gets the index off by one. Because of that it also misses the break for the numbers for the last bucket. There is also a small error introduced because the index calculation uses the range [0,l-s] instead of [s,l], which are only the same if s equals 0.
When I write bucketIndex as:
int bucketIndex( int n, int c, int s, int l )
{
for(int j = 1; j <= c; j++) {
if(n > s + (l-s)*j/c) {
continue;
} else {
return j-1;
}
}
return c-1;
}
and rewrite the main part of your algorithm as:
std::vector< std::vector<int> > bucket( c );
for(int i = 0; i < k; i++) {
bucket[ bucketIndex( n[i], c, s, l ) ].push_back( n[i] );
}
I get the items properly inserted into their buckets.
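Putting those pieces together, a complete version might look like the sketch below. Everything beyond bucketIndex and the insertion loop is my own filling-in: std::min_element and std::max_element replace the hand-written scan, std::sort stands in for the insertionSort from the question (which is not shown), and the sorted buckets are copied back into n at the end. Note that the question's call insertionSort(&bucket[i][0], k) sorts k elements per bucket rather than only the elements actually stored there, which would also pull uninitialized values into the result.

```cpp
#include <algorithm>
#include <vector>

// Index of the bucket for value n, given c buckets covering the range [s, l].
int bucketIndex(int n, int c, int s, int l) {
    const long long range = static_cast<long long>(l) - s;
    for (int j = 1; j <= c; j++) {
        if (n <= s + range * j / c) return j - 1;
    }
    return c - 1;
}

// Sort the k ints starting at n in ascending order.
void bucketSort(int* n, int k) {
    if (k <= 1) return;
    const int c = std::max(1, k / 10);            // number of buckets
    const int s = *std::min_element(n, n + k);    // smallest value
    const int l = *std::max_element(n, n + k);    // largest value
    std::vector<std::vector<int>> bucket(c);
    for (int i = 0; i < k; i++)
        bucket[bucketIndex(n[i], c, s, l)].push_back(n[i]);
    // Sort each bucket by its real size and write the buckets back in order.
    int out = 0;
    for (auto& b : bucket) {
        std::sort(b.begin(), b.end());
        for (int v : b) n[out++] = v;
    }
}
```

Because each bucket is a std::vector, its size is tracked automatically and no extra counter slot like bucket[i][k] is needed.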