Using quicksort to sort strings - c++

I am having trouble implementing quicksort to sort an array of strings. I am also relatively new to C++, so I am still struggling with some of the basics. Right now my code correctly reads in and creates an array of strings, but I run into problems when I run my quicksort. The first problem is that I believe the recursion is not stopping when it should: I get a segmentation fault after the quicksort has run for a little while.
Code (Modified):
#include "MyParser.h"
#include <iostream>
#include <fstream>
#include <string>
void resize(string*& words, int size)
{
string* newArray = new string[size*2];
for (int i = 0; (i < size)&&(i<size*2); i++)
newArray[i] = words[i];
for (int i = size; i < size*2; i++)
newArray[i] = "";
delete[] words;
words = newArray;
}
void partitionArray(string*& words, int& left, int& right, int pi)
{
int i = left;
int j = right;
string tmp;
string pivot = words[pi];
while (i < j) {
string str1 = words[i];
string str2 = words[j];
while ((str1.compare(pivot) < 0) && (i < right))
i++;
while ((str2.compare(pivot) >= 0) && (j > left))
j--;
if (i <= j)
{
tmp = words[i];
words[i] = words[j];
words[j] = tmp;
i++;
j--;
}
};
}
void quickSort(string*& words, int left, int right)
{
int i = left;
int j = right;
string tmp;
string pivot = words[(left + right) / 2];
/* partition */
int pivotIndex = (left + right) / 2;
pivotIndex = partitionArray(words, 0, right, pivotIndex);
cout << "start recursion" << endl;
/* recursion */
if (left < j)
quickSort(words, left, j);
if (i < right)
quickSort(words, i, right);
}
int main()
{
// define file reader
ofstream outData;
outData.open("logData.txt");
Parser* myParser = new Parser("testData.txt");
int sizeOfArray = 500;
string* words = new string[sizeOfArray];
int index = 0;
while(myParser->hasTokens())
{
if (index >= sizeOfArray)
{
resize(words, sizeOfArray);
sizeOfArray = sizeOfArray*2;
}
string currentWord = myParser->nextToken();
if (currentWord != "")
{
words[index] = currentWord;
index++;
}
}
int lastWordInArrayIndex = index;
quickSort(words, 0, lastWordInArrayIndex);
return 0;
}
Any help on this would be greatly appreciated!
MODIFIED
right now it will sort the following 11 elements correctly:
adfgh
btyui
dfghj
eerty
fqwre
kyuio
verty
wwert
yrtyu
zbsdf
zsdfg
but it does not terminate when attempting to sort the following text, which is parsed free of all delimiters except that words "like-this" with a single hyphen, or words with an apostrophe like "they're", are kept whole:
Three days after the quarrel, Prince Stepan Arkadyevitch
Oblonsky--Stiva, as he was called in the fashionable world--
woke up at his usual hour, that is, at eight o'clock in the
morning, not in his wife's bedroom, but on the leather-covered
sofa in his study. He turned over his stout, well-cared-for
person on the springy sofa, as though he would sink into a long
sleep again; he vigorously embraced the pillow on the other side
and buried his face in it; but all at once he jumped up, sat up
on the sofa, and opened his eyes.
"Yes, yes, how was it now?" he thought, going over his dream.
"Now, how was it? To be sure! Alabin was giving a dinner at
Darmstadt; no, not Darmstadt, but something American. Yes, but
then, Darmstadt was in America. Yes, Alabin was giving a dinner
on glass tables, and the tables sang, Il mio tesoro--not Il mio
tesoro though, but something better, and there were some sort of
little decanters on the table, and they were women, too," he
remembered.
Again any help with this issue would be greatly appreciated!

Your quickSort function will indeed recurse indefinitely:
void quickSort(string*& words, int left, int right)
{
int i = left;
int j = right;
...
if (left < j)
quickSort(words, left, j);
if (i < right)
quickSort(words, i, right);
}
i, j, left and right are not modified anywhere in that function, so if left < right the function will be called recursively with the same parameters again and again.
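One way to make the recursion shrink is to have the partition step report where the pivot ended up, and then recurse on the two sides of that position. The following is a minimal sketch rather than a repair of the posted algorithm: it swaps in a Lomuto-style partition, it assumes that right is the index of the last element (so main would pass index - 1 instead of index), and it takes a plain pointer instead of a reference to a pointer.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Partitions words[left..right] (both bounds inclusive) around the middle element.
// Returns the index where the pivot ends up.
int partitionArray(std::string* words, int left, int right)
{
    assert(left <= right);
    // Park the chosen pivot at the right end so the scan below can skip it.
    std::swap(words[left + (right - left) / 2], words[right]);
    const std::string pivot = words[right];
    int store = left; // everything before `store` is smaller than the pivot
    for (int i = left; i < right; i++) {
        if (words[i] < pivot) {
            std::swap(words[i], words[store]);
            store++;
        }
    }
    std::swap(words[store], words[right]); // move the pivot into its final slot
    return store;
}

void quickSort(std::string* words, int left, int right)
{
    if (left >= right) return; // zero or one element: nothing to do
    const int p = partitionArray(words, left, right);
    quickSort(words, left, p - 1);  // both ranges exclude the pivot,
    quickSort(words, p + 1, right); // so each call is strictly smaller
}
```

Because the pivot position is excluded from both recursive calls, every call works on fewer elements than its caller, which is what guarantees termination.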

Related

C++ Merge Sort Visualizer

I am trying to make a C++ console application that shows what merge sort looks like as it runs. I understand merge sort, and I created a program that sorts a vector of strings called sort_visual, where each string is filled with a random number of # characters. The merge sort orders the strings by length instead of ordering numbers, as is usually done. Every time I make a change to the vector, I clear the screen and print the entire vector through a draw function, to give the effect of visualizing the sort frame by frame. The problem is that when I use the draw function to print the entire sort_visual vector, it does not show any of the changes I have made; it prints the same thing over and over until the very end, when it finally prints the sorted order. What is going on? I even tried changing draw(sort_visual) to draw(sort_visual_), and that shows only the small part of the vector currently being worked on. Please try this code and tell me of any solutions. Thank you.
Here's the code:
#include <vector>
#include <iostream>
#include <ctime>
#include "stdlib.h"
#include "windows.h"
using namespace std;
void merge_sort(vector<string> &sort_visual_);
void merge_halves(vector<string>&left, vector<string>& right, vector<string>& sort_visual_);
void draw(vector <string> &sort_visual_);
vector <string> sort_visual;
int main()
{
srand(time(NULL));
//vector
vector<int> num_list;
//fill vector with random integers
for (int i = 0; i < 40; i++)
num_list.push_back(rand() % 40);
//Fill the visualizer strings which will be bars with #'s
for (int i = 0; i < num_list.size(); i++)
{
sort_visual.push_back("");
string temp;
for (int j = 0; j < num_list.at(i); j++)
{
temp.push_back('#');
}
sort_visual.at(i) = temp;
}
draw(sort_visual);
system("pause");
//sort function
merge_sort(sort_visual);
}
void merge_sort(vector<string> &sort_visual_)
{
//dont do anything if the size of vector is 0 or 1.
if (sort_visual_.size() <= 1) return;
//middle of vector is size/2
int mid = sort_visual_.size() / 2;
//2 vectors created for left half and right half
vector<string> left;
vector<string> right;
//divided vectors
for (int j = 0; j < mid; j++)
{
left.push_back(sort_visual_[j]); //add all the elements from left side of original vector into the left vector
}
for (int j = 0; j < (sort_visual_.size()) - mid; j++)
{
right.push_back(sort_visual_[mid + j]);//add all the elements from right side of original vector into the right vector
}
//recursive function for dividing the left and right vectors until they are length of 1
merge_sort(left);
merge_sort(right);
//do the actual merging function
merge_halves(left, right, sort_visual_);
}
void merge_halves(vector<string>&left, vector<string>&right, vector<string>& sort_visual_) //pass in 3 vectors
{
// sizes of each vector (left and right)
int nL = left.size();
int nR = right.size();
//declaring variables pointint to elements for each vector. i will represent finished produce vector
int i = 0, j = 0, k = 0;
//as long as j and k are less than the left and right sizes
while (j < nL && k < nR)
{
if (left[j].length() < right[k].length()) //if the string in the left vector is smaller than string in right vector
{
sort_visual_[i] = left[j];//ad the string from left vector in the sort_visuals vector(which is the final product)
j++;//increment j to move on
}
else
{
sort_visual_[i] = right[k];//otherwise add the string from right vector in the sort_visual vector
k++; //increment k to move on
}
i++; //i is the final vector, and we have to increment it to set it up to take in the next number
system("CLS");
draw(sort_visual);
Sleep(15);
}
while (j < nL)
{
sort_visual_[i] = left[j];
j++; i++;
system("CLS");
draw(sort_visual);
Sleep(15);
}
while (k < nR)
{
sort_visual_[i] = right[k];
k++; i++;
system("CLS");
draw(sort_visual);
Sleep(15);
}
}
void draw(vector <string> &sort_visual)
{
for (int i = 0; i < sort_visual.size(); i++)
{
cout << sort_visual.at(i) << endl;
}
}
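In merge_halves you write to sort_visual_ but draw sort_visual, which is a global. Every recursive call to merge_sort works on local copies (left and right), so the global is only written to during the final, top-level merge; that is why the same picture is printed until the very end. Avoid globals and this kind of mistake becomes much harder to make.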
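A sketch of one way to get a frame for every change: sort a range of the one shared vector instead of copies of it, and hand the whole vector to a redraw callback after each write. The on_change callback, the Bars alias and the lo/hi parameters are assumptions made for this example; they are not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Bars = std::vector<std::string>;

// Sorts bars[lo, hi) by string length. After every write into `bars`,
// on_change(bars) is called with the whole vector so the caller can redraw it.
void merge_sort(Bars& bars, std::size_t lo, std::size_t hi,
                const std::function<void(const Bars&)>& on_change)
{
    assert(lo <= hi && hi <= bars.size());
    if (hi - lo <= 1) return;
    const std::size_t mid = lo + (hi - lo) / 2;
    merge_sort(bars, lo, mid, on_change);
    merge_sort(bars, mid, hi, on_change);

    // Copy the two sorted halves, then merge them back into bars[lo, hi).
    const Bars left(bars.begin() + lo, bars.begin() + mid);
    const Bars right(bars.begin() + mid, bars.begin() + hi);
    std::size_t j = 0, k = 0;
    for (std::size_t i = lo; i < hi; i++) {
        // Take from the left half when the right half is used up,
        // or when the left bar is not longer than the right bar.
        const bool take_left =
            k == right.size() ||
            (j < left.size() && left[j].length() <= right[k].length());
        bars[i] = take_left ? left[j++] : right[k++];
        on_change(bars); // one frame per change, always showing the full vector
    }
}
```

In the program from the question, the callback would be the place to clear the screen, call draw, and sleep, for example merge_sort(sort_visual, 0, sort_visual.size(), redraw) with a redraw function of your own.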

How is this line returning the length of the array in recursion?

I am stepping through this recursion in the debugger, starting from main, to understand it. The debugger shows that smallAns ends up holding the size of the array input[], and I can't understand how. Can anyone explain this?
#include<iostream>
using namespace std;
int subsequences(char input[], int startIndex,char output[][50]){
if(input[startIndex] == '\0'){
output[0][0] = '\0';
return 1;
}
int smallAns = subsequences(input, startIndex+1, output);
for(int i = smallAns; i < 2*smallAns; i++){
int row = i - smallAns;
output[i][0] = input[startIndex];
int j = 0;
for(; output[row][j] != '\0'; j++){
output[i][j + 1] = output[row][j];
}
output[i][j + 1] = '\0';
}
return 2*smallAns;
}
int main(){
char input[] = "abc";
char output[100][50];
int ans = subsequences(input, 0, output);
for(int i = 0; i < ans; i++){
for(int j = 0; output[i][j] != '\0'; j++){
cout << output[i][j];
}
cout << endl;
}
}
Here's what the algorithm is doing:
Start at the end, with the empty subsequence (or "\0"). You have 1 subsequence.
Look at the last character not yet considered. For all the subsequences you have found, you can either add this last character, or don't. Therefore you have doubled the number of subsequences.
Repeat.
Therefore, 2 * smallAns means "Take the number of subsequences found in the lower recursive call, and double it." And this makes sense after you know how it was implemented. Thus the importance of comments and documentation in code. :)
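The doubling is easier to watch with std::string in place of the char arrays. This is a sketch of the same recursion, not the original function: the return type and the default argument are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns every subsequence of input[start..], mirroring the recursion in the question.
std::vector<std::string> subsequences(const std::string& input, std::size_t start = 0)
{
    assert(start <= input.size());
    if (start == input.size()) {
        // Base case: only the empty subsequence exists, so the count is 1.
        return std::vector<std::string>(1, std::string());
    }
    std::vector<std::string> result = subsequences(input, start + 1);
    const std::size_t smallAns = result.size(); // count for the shorter suffix
    for (std::size_t row = 0; row < smallAns; row++) {
        // Same subsequence again, this time with input[start] in front.
        result.push_back(input[start] + result[row]);
    }
    return result; // now holds 2 * smallAns entries
}
```

For "abc" the successive values of smallAns are 1, 2 and 4, and the final result has 8 entries, from the empty string up to "abc".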

Quicksort c++ first element as pivot

I have something like this, and I want to use the first element as the pivot.
Why is this program still not working?
void algSzyb1(int tab[],int l,int p)
{
int x,w,i,j;
i=l; //l is left and p is pivot, //i, j = counter
j=p;
x=tab[l];
do
{
while(tab[i]<x) i++;
while(tab[j]>x) j--;
if(i<=j)
{
w=tab[i];
tab[i]=tab[j];
tab[j]=w;
i++;
j--;
}
}
while(!(i<j));
if(l<j) algSzyb1(tab,l,j);
if(i<p) algSzyb1(tab,i,p);
}
Looking at the code, not really checking what it does, just looking at the individual lines, this one line stands out:
while(!(i<j));
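I look at that line, and I think: there is a bug somewhere around here. Even without tracing the rest of the code, this single line looks wrong: the loop body repeats only while i >= j, that is, only once the two indices have already met or crossed, which is the opposite of what a partition loop needs.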
I think you need to decrement j before incrementing i.
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
I have also added an extra condition so that i cannot sweep past j (and read uninitialized memory).
The name "pivot" is slightly misleading: it suggests an element that ends up in its final sorted position, but both this code and the version on the Wikipedia quicksort page only move the pivot into the upper partition and do not guarantee that it lands in its correct place.
The loop should end once you have swept through the whole list:
while( i < j ); /* not !(i<j) */
After the partitioning loop, each recursive call has to work on a strictly smaller range. The code you had caused a stack overflow because it kept recursing on the same range.
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
Full code
void algSzyb1(int tab[], int l, int p)
{
int x, w, i, j;
i = l;
j = p;
x = tab[l]; // come back here later :D
do
{
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
if (i < j)
{
w = tab[i];
tab[i] = tab[j];
tab[j] = w;
i++;
j--;
}
} while ((i<j));
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
}
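Tracing the full code above by hand on {2, 3, 1} ends with {1, 3, 2}: after the first swap i and j meet on an element that has not been compared against the pivot, and the loop stops there. For comparison, here is a sketch of the widely used two-index scheme with the first element as the pivot value. It keeps the asker's function name and inclusive bounds; compared with the code in the question, only the loop condition changes.

```cpp
#include <cassert>
#include <utility>

// Sorts tab[l..p] (both bounds inclusive), using the first element as the pivot value.
void algSzyb1(int tab[], int l, int p)
{
    assert(l <= p);
    int i = l;
    int j = p;
    const int x = tab[l]; // pivot value
    do {
        while (tab[i] < x) i++; // stops at the latest on an element equal to x
        while (tab[j] > x) j--;
        if (i <= j) {
            std::swap(tab[i], tab[j]);
            i++;
            j--;
        }
    } while (i <= j); // keep going until the indices have crossed
    if (l < j) algSzyb1(tab, l, j);
    if (i < p) algSzyb1(tab, i, p);
}
```

The first pass always moves i up and j down at least once, so both recursive ranges are strictly smaller than [l, p] and the recursion terminates.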

Dynamic Arrays not properly initializing

I am writing a program that uses two dynamic arrays to sort the original array: One for the left side, and one for the right side.
However, the dynamic arrays do not receive the contents of the original array in lines 23 and 28 (as shown by the cout statements throughout the code). They are either empty or contain elements that are out of bounds, so the program as a whole does not work. My question is: is the issue with the initialization itself, or with the declaration on lines 18-19? I personally believe it is with the declaration, but I am unsure what to do about it since, with a dynamic array, I do not want to mess with the size too much. I have included all of the functions for proper testing, but I will edit this question if they are deemed unnecessary. Thank you in advance for your help.
#include "stdafx.h"
#include <iostream>
using namespace std;
void Merge(int *array, int left, int middle, int right)
{
int * LArray;
int * RArray;
int counter = left;//This counter is used as a marker for the main array.
int mid = middle;
cout<<"Left " << left << "middle: "<< middle << " right: " << right<<endl;
LArray = new int[middle-left + 1];
RArray = new int[right];
/*Initializes LArray*/
for (int i = left; i < middle - left + 1; i++)
{
LArray[i] = array[i];
}
/*Initializes RArray*/
int temp = 0;
for (int i = middle; i < right; i++)
{
RArray[temp] = array[i];
temp++;
}
/*Prints out LArray*/
cout<<"LARRAY: ";
for (int i = left; i < middle- left + 1; i++)
{
cout<<LArray[i]<< " ";
}
/*Prints out RArray*/
cout<<endl<<"RARRY: ";
temp = 0;
for (int i = middle; i < right; i++)
{
temp = 0;
cout<<RArray[temp]<< " ";
temp++;
}
cout<<endl;
while (left <= middle && mid <= right)
{
/*This if statement checks if the number in the left array is smaller than the number in the right array*/
if (LArray[left] < RArray[right])
{
array[counter] = LArray[left];
left++;
counter++;
cout<<"First if: array[counter]: "<< array[counter]<<" LArray[left]" << LArray[left]<<" left: "<< left<<" counter : "<< counter<<endl;
}
/*This else statement checks if the number in the right array is smaller than the number in the left array*/
else
{
array[counter] = RArray[right];
mid++;
counter++;
cout<<" First else: array[counter]: "<< array[counter] << " RArray[right] "<< RArray[right]<<" mid: "<< mid<<" counter : "<< counter<<endl;
}
}
/*If RArray is completed, check this one for any remaining elements.*/
while (left <= middle)
{
array[counter] = LArray[left];
left++;
counter++;
cout<<" First while: array[counter]: "<< array[counter]<<" LArray[left]" << LArray[left]<<" left: "<< left<<" counter : "<< counter<<endl;
}
/*If LArray is completed, check this one for any remaining elements.*/
while (mid <= right)
{
array[counter] = RArray[right];
mid++;
counter++;
cout<<" Second while: array[counter]: "<< array[counter] << " RArray[right] "<< RArray[right]<<" mid: "<< mid<<" counter : "<< counter<<endl;
}
delete [] LArray;
delete [] RArray;
}
void MergeSort( int *array,int left, int right )
{
if ( left < right )
{
int middle = ( left + right ) / 2;
MergeSort( array, left, middle );
MergeSort( array, middle + 1, right );
Merge( array, left, middle, right );
}
};
/*Checks if the array listed is sorted by looping through and checking if the current number is smaller than the previous.*/
bool IsSorted(int* array, unsigned long long size)
{
for (int i = 0; i < size; i++)
{
cout<<array[i]<< " ";
}
cout<<endl;
for (int i = 1; i < size; i++)
{
if (array[i] < array[i-1])
return false;
}
return true;
}
int _tmain(int argc, _TCHAR* argv[])
{
int array[8] = {5, 2, 4, 7, 1, 3, 2, 6};
MergeSort(array, 0, 8);
bool check = IsSorted(array, 8);
if (check)
cout<<"It is sorted!";
else
cout<<"It is not sorted!";
return 0;
}
The idea of merge sort is that you recursively split an array into two halves, sort those two halves and then merge them back together.
Here are the things I see wrong with your code:
Oddities in the code that are not the cause of the problem:
Your left array is always larger than necessary, since you're allocating it to have middle elements and initializing it as such.
You initialize right array twice.
The actual problems with your algorithm:
You're not initializing counter correctly: it is always set to 0, so no matter how deep you are in the call stack, you always write the sorted elements to positions 0 through middle instead of to the range between left and right.
You're trying to index into the right array with the argument right, which will always be larger than the number of elements in the right array, since you allocated it to have right - middle elements.
You also have an error in your MergeSort function. You are never actually sorting the middle element since in the Merge function array[right] is never added to RArray, so the first recursive call won't sort it, and you're passing in middle + 1 to the second recursive call so it's not sorting it.
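Putting these points together, a merge step along the following lines avoids the indexing problems. It is a sketch under two assumptions: all bounds are inclusive (so the first call for the 8-element array would be MergeSort(array, 0, 7), not 8), and std::vector replaces the new[]/delete[] pairs so that each buffer is sized and released automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merges the sorted ranges array[left..middle] and array[middle+1..right] (bounds inclusive).
void Merge(int* array, int left, int middle, int right)
{
    assert(left <= middle && middle < right);
    // Each buffer holds exactly its own half and is indexed from 0.
    std::vector<int> LArray(array + left, array + middle + 1);
    std::vector<int> RArray(array + middle + 1, array + right + 1);
    std::size_t l = 0, r = 0;
    int counter = left; // next slot to fill in the main array
    while (l < LArray.size() && r < RArray.size()) {
        array[counter++] = (LArray[l] <= RArray[r]) ? LArray[l++] : RArray[r++];
    }
    while (l < LArray.size()) array[counter++] = LArray[l++]; // rest of the left half
    while (r < RArray.size()) array[counter++] = RArray[r++]; // rest of the right half
}

void MergeSort(int* array, int left, int right)
{
    if (left < right) {
        const int middle = left + (right - left) / 2;
        MergeSort(array, left, middle);
        MergeSort(array, middle + 1, right);
        Merge(array, left, middle, right);
    }
}
```

The buffers use their own indices l and r starting at 0, while counter walks the main array from left, so the two coordinate systems are never mixed.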

What is wrong with my C++ merge sort program?

I'm stuck at an impasse with this implementation. My n2 variable is being overwritten during the merging of the subarrays. What could be causing this? I have tried hard-coding values, but that does not seem to work.
#include <iostream>
#include <cstdlib>
#include <ctime> // For time(), time(0) returns the integer number of seconds from the system clock
#include <iomanip>
#include <algorithm>
#include <cmath>//added last nite 3/18/12 1:14am
using namespace std;
int size = 0;
void Merge(int A[], int p, int q, int r)
{
int i,
j,
k,
n1 = q - p + 1,
n2 = r - q;
int L[5], R[5];
for(i = 0; i < n1; i++)
L[i] = A[i];
for(j = 0; j < n2; j++)
R[j] = A[q + j + 1];
for(k = 0, i = 0, j = 0; i < n1 && j < n2; k++)//for(k = p,i = j = 1; k <= r; k++)
{
if(L[i] <= R[j])//if(L[i] <= R[j])
{
A[k] = L[i++];
} else {
A[k] = R[j++];
}
}
}
void Merge_Sort(int A[], int p, int r)
{
if(p < r)
{
int q = 0;
q = (p + r) / 2;
Merge_Sort(A, p, q);
Merge_Sort(A, q+1, r);
Merge(A, p, q, r);
}
}
void main()
{
int p = 1,
A[8];
for (int i = 0;i < 8;i++) {
A[i] = rand();
}
for(int l = 0;l < 8;l++)
{
cout<<A[l]<<" \n";
}
cout<<"Enter the amount you wish to absorb from host array\n\n";
cin>>size;
cout<<"\n";
int r = size; //new addition
Merge_Sort(A, p, size - 1);
for(int kl = 0;kl < size;kl++)
{
cout<<A[kl]<<" \n";
}
}
What tools are you using to compile the program? Some compilers have flags that switch on checks for this sort of thing, e.g. -fmudflap in gcc (I haven't used it, but it looks potentially useful). Newer versions of GCC and Clang provide -fsanitize=address for the same purpose.
If you can use a debugger (e.g. gdb) you should be able to add a 'data watch' for the variable n2, and the debugger will stop the program whenever it detects anything writing into n2. That should help you track down the bug. Or try valgrind.
A simple technique to temporarily stop this type of bug is to put some dummy variables around the one getting trashed, so:
int dummy1[100];
int n2 = r - q;
int dummy2[100];
int L[5], R[5];
Variables being trashed are usually caused by code writing beyond the bounds of arrays.
The culprit is likely R[5] because that is likely the closest. You can look in the dummies to see what is being written, and may be able to deduce from that what is happening.
Another option is to make all arrays huge while you track down the problem. Again, set the values beyond the correct bounds to a known value, and check that those values remain unchanged.
You could make a little macro to do those checks, and drop it in at any convenient place.
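The guard idea can be sketched as follows. GuardedBuffer, kGuard and CHECK_GUARDS are hypothetical names invented for this example. Note that an actual out-of-bounds write is still undefined behavior, and the compiler is free to lay out separate local variables however it likes, so this is a debugging aid and not a guarantee; placing the guards in the same struct as the buffer at least fixes their order in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

const int kGuard = 0x5A5A5A5A; // arbitrary known value, chosen for this sketch

// Hypothetical helper type: a 5-element buffer with guard arrays on both sides.
struct GuardedBuffer {
    int before[8];
    int data[5];
    int after[8];

    GuardedBuffer()
    {
        std::fill(std::begin(before), std::end(before), kGuard);
        std::fill(std::begin(data), std::end(data), 0);
        std::fill(std::begin(after), std::end(after), kGuard);
    }

    // True while every guard word still holds the known value.
    bool intact() const
    {
        auto is_guard = [](int v) { return v == kGuard; };
        return std::all_of(std::begin(before), std::end(before), is_guard) &&
               std::all_of(std::begin(after), std::end(after), is_guard);
    }
};

// Hypothetical macro: drop it in wherever the guards should still be untouched.
#define CHECK_GUARDS(buf) assert((buf).intact())
```

In Merge, L and R would each become a GuardedBuffer, the loops would write to L.data[i] and R.data[j], and a CHECK_GUARDS(L); CHECK_GUARDS(R); after each loop would point at the loop that overran.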
I used a similar Merge function earlier and it did not work properly. I then redesigned it, and now it works fine. Below is the redesigned definition of the merge function in C++.
void merge(int a[], int p, int q, int r){
int n1 = q-p+1; //no of elements in first half
int n2 = r-q; //no of elements in second half
int i, j, k;
int * b = new int[n1+n2]; //temporary array to store merged elements
i = p;
j = q+1;
k = 0;
while(i<(p+n1) && j < (q+1+n2)){ //merging the two sorted arrays into one
if( a[i] <= a[j]){
b[k++] = a[i++];
}
else
b[k++] = a[j++];
}
if(i >= (p+n1)) //checking first which sorted array is finished
while(k < (n1+n2)) //and then store the remaining element of other
b[k++] = a[j++]; //array at the end of merged array.
else
while(k < (n1+n2))
b[k++] = a[i++];
for(i = p,j=0;i<= r;){ //store the temporary merged array at appropriate
a[i++] = b[j++]; //location in main array.
}
delete [] b;
}
I hope it helps.
void Merge(int A[], int p, int q, int r)
{
int i,
j,
k,
n1 = q - p + 1,
n2 = r - q;
int L[5], R[5];
for(i = 0; i < n1; i++)
L[i] = A[i];
You only allocate L[5], but the n1 bound you're using is based on inputs q and p -- and the caller is allowed to call the function with values of q and p that allow writing outside the bounds of L[]. This can manifest itself as over-writing any other automatic variables, but because it is undefined behavior, just about anything could happen. (Including security vulnerabilities.)
I do not know what the best approach to fix this is -- I don't understand why you've got fixed-length buffers in Merge(), I haven't read closely enough to discover why -- but you should not access L[i] when i is greater than or equal to 5.
The same reasoning holds for R[]. And, since A is passed to Merge() as a pointer, it would make sense to ensure that your accesses to it are always in bounds as well. (I haven't spotted them going out of bounds, but since this code needs reworking anyway, I'm not sure it's worth my looking for them carefully.)
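A sketch of a Merge whose buffers are sized from its arguments, with checked access through std::vector::at(), which throws std::out_of_range instead of silently overwriting a neighbouring variable such as n2. Three further adjustments are my own reading of the posted code and do not come from the answer above: the left half is copied from A[p + i] rather than A[i], k starts at p rather than 0, and the leftover elements of either half are copied back.

```cpp
#include <cassert>
#include <vector>

// Merges the sorted ranges A[p..q] and A[q+1..r] (all bounds inclusive).
void Merge(int A[], int p, int q, int r)
{
    assert(p <= q && q < r);
    const int n1 = q - p + 1;
    const int n2 = r - q;
    std::vector<int> L(n1), R(n2); // sized from the arguments rather than fixed at 5
    for (int i = 0; i < n1; i++) L.at(i) = A[p + i];     // offset by p, not A[i]
    for (int j = 0; j < n2; j++) R.at(j) = A[q + 1 + j];
    int i = 0, j = 0, k = p; // k starts at p so only A[p..r] is written
    while (i < n1 && j < n2) {
        A[k++] = (L.at(i) <= R.at(j)) ? L.at(i++) : R.at(j++);
    }
    while (i < n1) A[k++] = L.at(i++); // leftovers from the left half
    while (j < n2) A[k++] = R.at(j++); // leftovers from the right half
}

void Merge_Sort(int A[], int p, int r)
{
    if (p < r) {
        const int q = p + (r - p) / 2;
        Merge_Sort(A, p, q);
        Merge_Sort(A, q + 1, r);
        Merge(A, p, q, r);
    }
}
```

With these bounds the call from main would be Merge_Sort(A, 0, size - 1); starting at p = 1, as the posted main does, leaves A[0] out of the sort.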