Dynamic memory allocation in merge sort - c++

I wrote a program for merge sort (the basic algorithm) and it works fine. However, since I have to read the integers from a very large file, I wanted to allocate the temporary array dynamically in the recursive calls. I wrote the following code, but it gives me an error. Could you please help me identify where I am making the mistake?
The program actually counts the number of inversions in an array (if i < j and arr[i] > arr[j], then this is an inversion). The program I have written is below.
I don't want to declare an array of 10000 integers on the stack every time I go into a recursive call.
The error I get is: std::bad_alloc at memory location 0x004dd940..
I have edited the question to include the error message. The execution breaks, and Visual Studio goes into debug mode and opens the file osfinfo.c.
#include<stdio.h>
#include <iostream>
using namespace std;
unsigned int mixAndCount(int * arr,int low, int mid,int high) {
int *num = new int[high-low+1];// THIS IS WHERE THE ERROR OCCURS
int l = low ;
int r = mid+1;
unsigned int count=0;
int i =low;
while((l<=mid)&&(r<=high))
{
if(arr[l]<=arr[r])
{
num[i]=arr[l];
l++;
}
else
{
num[i]=arr[r];
r++;
count=count + (mid-l+1);
}
i++;
}
if(l>mid)
{
for(int k=r;k<=high;k++)
{
num[i]=arr[k];
i++;
}
}
else
{
for(int k=l;k<=mid;k++)
{
num[i]=arr[k];
i++;
}
}
for(int k=low;k<=high;k++) arr[k]=num[k];
delete[] num;
return count;
}
unsigned int mergeAndCount(int * arr, int low , int high ) {
if(low>=high) {
return 0;
}
else {
int mid = (low+high)/2;
unsigned int left = mergeAndCount(arr, low , mid);
unsigned right = mergeAndCount(arr, mid+1, high);
unsigned int split = mixAndCount(arr, low , mid , high);
return left+right+split;
}
}
int main ()
{
int numArr[100000];
FILE * input = fopen("IntegerArray.txt", "r");
int i =0;
while(!feof(input)) {
int num;
fscanf(input, "%d", &num);
numArr[i] = num;
i++;
}
fclose(input);
unsigned int count = mergeAndCount(numArr,0, i-1 );
cout<<count<<endl;
return 0;
}

std::bad_alloc at memory location 0x004dd940..
is an exception thrown by new when it cannot allocate the requested memory.
int *num = new int[high-low+1];
It seems the requested size is too large, which means you need to trace the values of high and low at the point of failure.
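One more thing worth checking while tracing those values (this is my reading of the posted code, not something the answer above states): num holds high-low+1 elements, but the question indexes it with i starting at low, so every call with low > 0 writes past the end of the buffer. Heap corruption of that kind can plausibly surface later as std::bad_alloc. A minimal sketch that indexes the temporary buffer from 0 is below; the use of std::vector and a 64-bit counter are my assumptions, not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted runs arr[low..mid] and arr[mid+1..high] and returns the
// number of split inversions. The temporary buffer is indexed from 0, not low.
unsigned long long mixAndCount(std::vector<int>& arr, int low, int mid, int high) {
    std::vector<int> num(static_cast<std::size_t>(high - low + 1));
    int l = low;
    int r = mid + 1;
    std::size_t i = 0;
    unsigned long long count = 0;
    while (l <= mid && r <= high) {
        if (arr[l] <= arr[r]) {
            num[i++] = arr[l++];
        } else {
            num[i++] = arr[r++];
            // Every element still waiting in the left run is larger than
            // the element just taken from the right run.
            count += static_cast<unsigned long long>(mid - l + 1);
        }
    }
    while (l <= mid) num[i++] = arr[l++];
    while (r <= high) num[i++] = arr[r++];
    // Copy back, shifting by low so num[0] lands on arr[low].
    for (std::size_t k = 0; k < num.size(); ++k) {
        arr[static_cast<std::size_t>(low) + k] = num[k];
    }
    return count;
}

unsigned long long mergeAndCount(std::vector<int>& arr, int low, int high) {
    if (low >= high) return 0;
    int mid = low + (high - low) / 2;
    unsigned long long left = mergeAndCount(arr, low, mid);
    unsigned long long right = mergeAndCount(arr, mid + 1, high);
    return left + right + mixAndCount(arr, low, mid, high);
}
```

Calling mergeAndCount(v, 0, v.size() - 1) on a non-empty vector sorts it in place and returns the inversion count.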

Be careful with dynamic memory allocation: it is considerably slower than using the stack. Think twice before you leave your code in this form. You can write a simple test case with std::chrono to measure it:
http://en.cppreference.com/w/cpp/chrono/duration
You don't need dynamic allocation in every call here; everything is done in one local scope.
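One way to read this advice, sketched under my own assumptions: allocate a single scratch buffer once and pass it down the recursion, so no call allocates anything. The names countInversions and sortAndCount are hypothetical, and the half-open range [lo, hi) is a choice I made for the sketch, not the question's convention.

```cpp
#include <cstddef>
#include <vector>

namespace {

// Sorts a[lo, hi) (half-open) and returns its inversion count.
// tmp must be at least as large as a; it is reused by every call.
unsigned long long sortAndCount(std::vector<int>& a, std::vector<int>& tmp,
                                std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return 0;
    std::size_t mid = lo + (hi - lo) / 2;
    unsigned long long count = sortAndCount(a, tmp, lo, mid);
    count += sortAndCount(a, tmp, mid, hi);

    std::size_t l = lo, r = mid, i = lo;
    while (l < mid && r < hi) {
        if (a[l] <= a[r]) {
            tmp[i++] = a[l++];
        } else {
            tmp[i++] = a[r++];
            count += mid - l;  // elements left in the left run are all larger
        }
    }
    while (l < mid) tmp[i++] = a[l++];
    while (r < hi) tmp[i++] = a[r++];
    for (std::size_t k = lo; k < hi; ++k) a[k] = tmp[k];
    return count;
}

}  // namespace

// Allocates the scratch buffer exactly once, then recurses.
unsigned long long countInversions(std::vector<int>& a) {
    std::vector<int> tmp(a.size());
    return sortAndCount(a, tmp, 0, a.size());
}
```

Because tmp is indexed with the same positions as a, there is no offset arithmetic to get wrong, at the cost of keeping one extra array of the full input size alive for the whole sort.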

Related

Randomly Shuffle an array and using quick sort algorithm

I have been trying to write a code to randomly shuffle the array elements, and then use the quick sort algorithm on the array elements. This is the code I wrote:
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
void swap(int *a, int *b)
{
int temp = *a;
*a = *b;
*b = temp;
}
void randomise(int arr[], int s, int e)
{
int j;
srand(time(NULL));
for (int i = e; i >= s; i--)
{
int j = rand() % (i + 1);
swap(&arr[i], &arr[j]);
}
}
int Partition(int arr[], int s, int e)
{
int pivot = arr[e];
int i = s - 1;
int j;
for (j = s; j <= e; j++)
{
if (arr[j] <= pivot)
{
i++;
swap(&arr[i], &arr[j]);
}
}
swap(&arr[i + 1], &arr[e]);
return i + 1;
}
void QuickSort(int arr[], int s, int e)
{
if (s >= e)
return;
int x = Partition(arr, s, e);
QuickSort(arr, s, x - 1);
QuickSort(arr, x + 1, e);
}
int main()
{
int b[] = {1, 2, 3, 4, 5};
randomise(b, 0, 4);
cout << "Elements of randomised array are:";
for (int i = 0; i <= 4; i++)
{
cout << b[i] << endl;
}
QuickSort(b, 0, 4);
cout << "Elements after quick sort are:";
for (int i = 0; i <= 4; i++)
{
cout << b[i] << endl;
}
return 0;
}
However, while debugging with GDB, I found that this program produces a segmentation fault. On execution, the program prints:
Elements of randomised array are:4
5
2
3
1
Can someone tell me what the bug in this code is? (I have tried debugging it with GDB, but I am still clueless.)
Basically, when the error is a segmentation fault, you should be looking for the kind of bug that makes you want to bang your head against the wall once you find it. On line 26, in your Partition function, change <= to <:
for (j = s; j < e; j++)
A little explanation about quicksort: each time the partition step runs on a partition, the last element of that partition, called the pivot, reaches its final place in the array. The partition function returns that final position of the pivot. The array is then split into two more partitions, one before the pivot's position and one after it.
Your bug is that the partition function returns the pivot's real position + 1. As a result, you run QuickSort on the wrong partition: one that is already sorted, but which the program keeps trying to sort over and over because of the wrong split. Each time you call a function, its variables are saved on the call stack. Since you are calling a recursive function over and over (one that never stops), this stack fills up and overflows. After that the behavior is undefined, and the program may fail with an error that does not describe the problem accurately. This is why you are getting a segmentation fault.
But why do you return the real pivot position + 1? Because the for loop in your partition function also visits the pivot, which it shouldn't: the pivot isn't supposed to be compared with itself. So you increment i one time too many.
See https://en.wikipedia.org/wiki/Call_stack for additional information about the stack and how function calls work.
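For reference, here is a sketch of the two functions with that one-character change applied. Using std::swap instead of the question's pointer-based swap is my substitution; everything else follows the posted code.

```cpp
#include <utility>

// Lomuto partition with the last element as pivot.
// Returns the final index of the pivot.
int Partition(int arr[], int s, int e) {
    int pivot = arr[e];
    int i = s - 1;
    for (int j = s; j < e; j++) {  // j < e: the pivot itself is never visited
        if (arr[j] <= pivot) {
            i++;
            std::swap(arr[i], arr[j]);
        }
    }
    std::swap(arr[i + 1], arr[e]);  // place the pivot between the two halves
    return i + 1;
}

void QuickSort(int arr[], int s, int e) {
    if (s >= e) return;
    int x = Partition(arr, s, e);
    QuickSort(arr, s, x - 1);  // everything left of the pivot
    QuickSort(arr, x + 1, e);  // everything right of the pivot
}
```

With this loop bound, i + 1 can never exceed e, so the final swap stays inside the array and both recursive calls always work on strictly smaller ranges.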

Quicksort Algorithm,Incorrect answer and segmentation fault with some specific input sequences

I am learning C++ and I am relatively new to programming. I wrote a C++ program that implements the quicksort algorithm using the last element as the pivot. Whenever I try to execute it, the answer is always wrong, and for some specific input sequences I get a segmentation fault.
I have tried playing around with the while loop and changing it to "if" statements to see if anything happens. The results change but they are incorrect.
// Example program
#include <iostream>
using namespace std;
int partition(int a[],int l,int r)
{
//int l=0,r=p-1;
int p=r+1;
while(r>l)
{
while (a[l]<a[p])
{
l=l+1;
}
while(a[r]>a[p])
{
r=r-1;
}
//if(a[l]>a[r]){
int f=a[r];
a[r]=a[l];
a[l]=f;
//}
}
int k=a[l];
a[l]=a[p];
a[p]=a[l];
p=l;
return p;
}
void quicksort(int a[],int l,int r)
{
int p;
if (l<r){
p=partition(a,l,r);
quicksort(a,0,p-2);
quicksort(a,p+1,r);
}
}
int main(){
int k;
cout<<"enter the number of elements in array";
cin>>k;
int a[k];
for (int i=0;i<k;i++)
{
cin>>a[i];
}
//int p=k-1;
int l=0;
int r=k-2;
quicksort(a,l,r);
for (int i=0;i<k;i++)
{
cout<<a[i];
}
return 0;
}
actual results:
enter the number of elements in array
4
3
0
1
2
sorted result
1322
expected results:
0123
If the posted code is compiled with warnings enabled, the following diagnostic is produced:
prog.cc:25:9: warning: unused variable 'k' [-Wunused-variable]
int k=a[l];
^
prog.cc:47:10: warning: variable length arrays are a C99 feature [-Wvla-extension]
int a[k];
^
The first one is generated by what seems to be a typo in function partition:
int k=a[l];
a[l]=a[p];
a[p]=a[l]; // <-- That should be 'a[p] = k;' to swap the values
Of course, the proper way of swapping those two values should be
std::swap(a[l], a[p]);
The second warning is easily fixed by using the proper data structure, which in C++ is a std::vector, and passing a reference to it to the other functions instead of an int *.
Those aren't the only issues in OP's code, which seems to implement a variant of the Quicksort algorithm using the Lomuto partition scheme.
In OP's code the first call is something like
quicksort(a, 0, k - 2); // k being the size of the VLA, it skips the last element
While, using a vector and following the convention of denoting a range by its first element and the one past the end, we could write the entry point as
// Note that std::vector::size() returns an unsigned type
quicksort(a, 0, a.size());
So that the quicksort function could be implemented as
void quicksort(std::vector<int> &a, size_t low, size_t high)
{
    if ( low < high) {
        size_t p = partition(a, low, high);
        quicksort(a, low, p); // <- Note that OP's code uses '0' instead of 'low'
        quicksort(a, p + 1, high);
    }
}
If I correctly guessed the variant which the OP is trying to implement, the partition function could be simplified (and fixed) to
size_t partition(std::vector<int> &a, size_t low, size_t high)
{
    size_t p = high - 1; // <- Assumes high > 0
    size_t i = low;
    for( size_t j = low; j < p; ++j )
    {
        if(a[j] < a[p]) {
            std::swap(a[i], a[j]);
            ++i;
        }
    }
    std::swap(a[i], a[p]);
    return i;
}

C++ Counting inversions in array, Fatal Signal 11 (BIT)

I was given this challenge in a programming "class". Eventually I decided to go for the "Binary Indexed Trees" solution, as data structures are a thing I'd like to know more about. Implementing the BIT was fairly straightforward; the things after that, not so much. I ran into "Fatal Signal 11" when uploading the solution to the server, which, from what I've read, is somewhat similar to a null pointer exception. I couldn't figure out the problem, so I decided to check out other solutions that use a BIT, but stumbled upon the same problem.
#include<iostream>
using namespace std;
/* <BLACK MAGIC COPIED FROM geeksforgeeks.org> */
int getSum(int BITree[], int index){
int sum = 0;
while (index > 0){
sum += BITree[index];
index -= index & (-index);
}
return sum;
}
void updateBIT(int BITree[], int n, int index, int val){
while (index <= n){
BITree[index] += val;
index += index & (-index);
}
}
/* <BLACK MAGIC COPIED FROM geeksforgeeks.org> */
int Count(int arr[], int x){
int sum = 0;
int biggest = 0;
for (int i=0; i<x; i++) {
if (biggest < arr[i]) biggest = arr[i];
}
int bit[biggest+1];
for (int i=1; i<=biggest; i++) bit[i] = 0;
for (int i=x-1; i>=0; i--)
{
sum += getSum(bit, arr[i]-1);
updateBIT(bit, biggest, arr[i], 1);
}
return sum;
}
int main(){
int x;
cin >> x;
int *arr = new int[x];
for (int temp = 0; temp < x; temp++) cin >> arr[temp];
/*sizeof(arr) / sizeof(arr[0]); <-- someone suggested this,
but it doesn't change anything from what I can tell*/
cout << Count(arr,x);
delete [] arr;
return 0;
}
I am quite stumped on this. It could be just some simple thing I'm missing, but I really don't know. Any help is much appreciated!
The problem states that every number lies between 1 and 10^18, so your biggest number can be 10^18. That is far too large for the following line, which would need one array element per possible value:
int bit[biggest+1];
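A common way around this, which the answer above does not spell out, is coordinate compression: replace each value by its rank among the distinct values, so the tree needs only as many slots as there are distinct inputs. The sketch below assumes the values fit in long long and that n is small enough for the ranks to fit in memory; the function name countInversions is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts inversions with a binary indexed tree over compressed ranks.
long long countInversions(const std::vector<long long>& values) {
    // Coordinate compression: sorted distinct values, rank = position + 1.
    std::vector<long long> sorted(values);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

    // One slot per distinct value (plus unused slot 0), stored on the heap.
    std::vector<int> bit(sorted.size() + 1, 0);
    long long inversions = 0;

    // Walk from right to left, as the question's Count function does.
    for (std::size_t k = values.size(); k-- > 0;) {
        std::size_t rank = static_cast<std::size_t>(
            std::lower_bound(sorted.begin(), sorted.end(), values[k]) - sorted.begin()) + 1;
        // Prefix sum over ranks below this one: smaller elements already seen.
        for (std::size_t i = rank - 1; i > 0; i -= i & (~i + 1)) {
            inversions += bit[i];
        }
        // Record this element at its rank.
        for (std::size_t i = rank; i < bit.size(); i += i & (~i + 1)) {
            bit[i] += 1;
        }
    }
    return inversions;
}
```

The expression i & (~i + 1) isolates the lowest set bit, the same job index & (-index) does in the question, written for an unsigned index.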

incorrect checksum for freed object on large input

I've written a program which computes the number of inversions in a .txt file (the first number is the count of numbers, then come the numbers themselves). On small input (5 or 10 numbers) it works fine, but when the input is 100,000 numbers (each less than 100,000) I get the following error:
incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug***
Here is the code:
#include <stdio.h>
long int merge(int *arr, const int start, const int half, const int end)
{
int s=start;
int i=0;
int cinv=0;
int j=half+1;
int* barr = new int[end+start-1];
while((s<=half)&&(j<=end)){
if(arr[s]<=arr[j]){
barr[i]=arr[s];
s++;
}
else{
barr[i]=arr[j];
j++;
cinv++;
}
i++;
}
if(s>half){
for(int k = j;k<=end;k++){
barr[i]=arr[k];
i++;
}
}
else{
for(int k=s;k<=half;k++){
barr[i]=arr[k];
i++;
}
}
for(int k=0;k<=end-start;k++) {
arr[k+start]=barr[k];
}
delete[] barr;
return cinv;
}
long int mergesort(int* arr, int start, int end){
int half=(start+end)/2;
long int cinv=0;
if (start<end){
cinv+=mergesort(arr, start, half);
cinv+=mergesort(arr, half+1, end);
cinv+=merge(arr, start, half, end);
return cinv;
}
return cinv;
}
int main(){
int len;
freopen("input.txt", "rt", stdin);
freopen("output.txt", "wt", stdout);
scanf("%d", &len);
int *arr= new int[len];
for (int i=0; i<len; i++){
scanf("%d", &arr[i]);
}
long int cinv=mergesort(arr, 0, len-1);
printf("\nInversions with merge=%ld", cinv);
delete [] arr;
return 0;
}
Thanks in advance for your help.
The dimension of your temporary array in merge,
int* barr = new int[end+start-1];
is not correct. When you call merge with start == 0 and end == 1, this will yield an array dimension of 0. At the other end of the array, it will allocate twice as much memory as needed. Change this to:
int* barr = new int[end - start + 1];
Allocating a zero-length array with new is legal, but writing to any element of it is undefined behavior, so the symptoms depend on the platform. Your program crashes reliably on my Linux platform even with small input arrays.
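To make the arithmetic concrete, the two size expressions can be compared directly. This is only an illustration; the helper names requestedSize and neededSize are made up for the comparison.

```cpp
// Size the question's code requests for the inclusive range [start, end].
constexpr int requestedSize(int start, int end) { return end + start - 1; }

// Number of elements the inclusive range [start, end] actually contains.
constexpr int neededSize(int start, int end) { return end - start + 1; }

// First pair of elements: two values are merged into a zero-length buffer.
static_assert(requestedSize(0, 1) == 0, "nothing is allocated");
static_assert(neededSize(0, 1) == 2, "two slots are needed");

// Last pair of a 100,000 element array: far more is allocated than needed.
static_assert(requestedSize(99998, 99999) == 199996, "oversized buffer");
static_assert(neededSize(99998, 99999) == 2, "two slots are needed");
```

The undersized case near the start of the array is the one that corrupts the heap; the oversized case near the end only wastes memory.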

Floating point exception

#include <cstdio>
#include <ctime>
int populate_primes(int array[])
{
const int max = 1000000;
char numbers[max+1];
int count=1;
array[0]=2;
for(int i=max;i>0;i-=2)numbers[i]=0;
for(int i=max-1;i>0;i-=2)numbers[i]=1;
int i;
for(i=3;i*i<=max;i+=2){
if(numbers[i]){
for(int j=i*i;j<max+1;j+=i)numbers[j]=0; array[count++]=i;
}
}
int limit = max/2;
for(;i<limit;i++) if(numbers[i])array[count++]=i;
return count;
}
int factorize(int number,int array[])
{
int i=0,factor=1;
while(number>0){
if(number%array[i]==0){
factor++;
while(number%array[i]==0)number/=array[i];
}
i++;
}
printf("%d\n",factor);
return factor;
}
int main()
{
int primes[42000];
const int max = 1000000;
int factors[max+1];
clock_t start = clock();
int size = populate_primes(primes);
factorize(1000,primes);
printf("Execution time:\t%lf\n",(double)(clock()-start)/CLOCKS_PER_SEC);
return 0;
}
I am trying to find the number of factors using a simple algorithm. The populate_primes part runs fine, but the factorize part fails with a floating point exception.
Please look at the code and tell me what my mistake is.
In your factorize function you read array[i] starting at i = 0 and keep incrementing i with no upper bound.
This array is the primes array populated by populate_primes. That function does set primes[0] = 2 (which is why count starts at 1), but it only fills the entries up to the count it returns.
So once i runs past that count, the element you read is not initialized, and you probably get a division by zero error.
You need to pass the size that populate_primes returned to factorize:
factorize(int number, int array[], int size);
The problem is that your array[] is not fully loaded; it is filled only up to the size value, so you need to check against that.
Also, the logic inside factorize is wrong. You need to test (number > 1) rather than (number > 0), because repeated division never takes number below 1.
Try the function below to see some of these problems:
#define MAX_PRIMES 42000
int factorize(int number,int array[])
{
    int i=0,factor=1;
    for (i=0; number>0 && i< MAX_PRIMES; i++){
        if (array[i] == 0 || array[i] == 1) {
            printf("Error: array[%d] = %d\n", i, array[i]);
        } else {
            if(number%array[i]==0){
                factor++;
                while(number%array[i]==0 && number>0) {
                    printf("%d %d\n", number, array[i]);
                    number/=array[i];
                }
            }
        }
    }
    printf("%d\n",factor);
    return factor;
}
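Once the diagnostic output has confirmed the problem, the two points above (pass the size, stop at number > 1) could be combined roughly as follows. This is a sketch, not the poster's final code: the size parameter follows the suggested signature, the printf is left out, and the return value keeps the question's convention of starting at 1 and adding one per distinct prime factor found.

```cpp
// Counts distinct prime factors of number that appear in primes[0..size-1],
// starting from 1 as the question's code does.
int factorize(int number, const int primes[], int size) {
    int factor = 1;
    // i < size keeps reads inside the filled part of the table;
    // number > 1 ends the loop once everything has been divided out.
    for (int i = 0; i < size && number > 1; i++) {
        if (number % primes[i] == 0) {
            factor++;
            while (number % primes[i] == 0) {
                number /= primes[i];
            }
        }
    }
    return factor;
}
```

In the question's main this would be called as factorize(1000, primes, size). If number is still greater than 1 when the loop ends, it has a prime factor larger than any entry in the table, which this sketch does not count.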