binary search array overflow c++ - c++

I'm a Computer Science student. This is some code that I completed for my Data Structures and Algorithms class. It compiles fine, and runs correctly, but there is an error in it that I corrected with a band-aid. I'm hoping to get an answer as to how to fix it the right way, so that in the future, I know how to do this right.
The object of the assignment was to create a binary search. I took a program that I had created that used a heap sort and added a binary search. I used Visual Studio for my compiler.
My problem is that I chose to read in my values from a text file into an array. Each integer in the text file is separated by a tabbed space. In line 98, the file reads in correctly, but when I get to the last item in the file, the counter (n) counts one time too many, and assigns a large negative number (because of the array overflow) to that index in the array, which then causes my heap sort to start with a very large negative number that I don't need. I put a band-aid on this by assigning the last spot in the array the first spot in the array. I have compared the number read out to my file, and every number is there, but the large number is gone, so I know it works. This is not a suitable fix for me, even if the program does run correctly. I would like to know if anyone knows of a correct solution that would iterate through my file, assign each integer to a spot in the array, but not overflow the array.
Here is the entire program:
#include "stdafx.h"
#include <iostream>
#include <fstream>
using std::cout;
using std::cin;
using std::endl;
using std::ifstream;
#define MAXSIZE 100
void heapify(int heapList[], int i, int n) //i shows the index of array and n is the counter
{
int listSize;
listSize=n;
int j, temp;//j is a temporary index for array
temp = heapList[i];//temporary storage for an element of the array
j = 2 * i;//end of list
while (j <= listSize)
{
if (j < listSize && heapList[j + 1] > heapList[j])//if the value in the next spot is greater than the value in the current spot
j = j + 1;//moves value if greater than value beneath it
if (temp > heapList[j])//if the value in i in greater than the value in j
break;
else if (temp <= heapList[j])//if the value in i is less than the value in j
{
heapList[j / 2] = heapList[j];//assigns the value in j/2 to the current value in j--creates parent node
j = 2 * j;//recreates end of list
}
}
heapList[j / 2] = temp;//assigns to value in j/2 to i
return;
}
//This method is simply to iterate through the list of elements to heapify each one
void buildHeap(int heapList[], int n) {//n is the counter--total list size
int listSize;
listSize = n;
for (int i = listSize / 2; i >= 1; i--)//for loop to create heap
{
heapify(heapList, i, n);
}
}
//This sort function will take the values that have been made into a heap and arrange them in order so that they are least to greatest
void sort(int heapList[], int n)//heapsort
{
buildHeap(heapList, n);
for (int i = n; i >= 2; i--)//for loop to sort heap--i is >= 2 because the last two nodes will not have anything less than them
{
int temp = heapList[i];
heapList[i] = heapList[1];
heapList[1] = temp;
heapify(heapList, 1, i - 1);
}
}
//Binary search
void binarySearch(int heapList[], int first, int last) {//first=the beginning of the list, last=end of the list
int mid = first + last / 2;//to find middle for search
int searchKey;//number to search
cout << "Enter a number to search for: ";
cin >> searchKey;
while ((heapList[mid] != searchKey) && (first <= last)) {//while we still have a list to search through
if (searchKey < heapList[mid]) {
last = mid - 1;//shorten list by half
}
else {
first = mid + 1;//shorten list by half
}
mid = (first + last) / 2;//find new middle
}
if (first <= last) {//found number
cout << "Your number is " << mid << "th in line."<< endl;
}
else {//no number in list
cout << "Could not find the number.";
}
}
int main()
{
int j = 0;
int n = 0;//counter
int first = 0;
int key;//to prevent the program from closing
int heapList[MAXSIZE];//initialized heapList to the maximum size, currently 100
ifstream fin;
fin.open("Heapsort.txt");//in the same directory as the program
while (fin >> heapList[n]) {//read in
n++;
}
heapList[n] = heapList[0];
int last = n;
sort(heapList, n);
cout << "Sorted heapList" << endl;
for (int i = 1; i <= n; i++)//for loop for printing sorted heap
{
cout << heapList[i] << endl;
}
binarySearch(heapList, first, last);
cout << "Press Ctrl-N to exit." << endl;
cin >> key;
}

int heapList[MAXSIZE];//initialized heapList to the maximum size, currently 100
This comment is wrong: the heapList array is declared, not initialized. After you have read all the data from the file, n is one past the last element that was filled, so heapList[n] is an uninitialized cell, and reading it invokes undefined behavior. You could initialize the array before using it, treat n - 1 as the last valid index, or, better, use std::vector instead of an array.
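As a minimal sketch of the std::vector suggestion, assuming the file holds whitespace-separated integers: the helper name readAll is hypothetical and not part of the original program. The vector grows as values arrive, so there is no fixed capacity to overrun and no unfilled slot to read by accident.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying the function without a file
#include <vector>

// Reads every whitespace-separated integer from the stream.
// The loop stops at end of input or at the first token that is not an integer.
std::vector<int> readAll(std::istream& in) {
    std::vector<int> values;
    int value;
    while (in >> value) {
        values.push_back(value);  // only successfully parsed values are stored
    }
    return values;
}
```

With the file from the question, this could be called on an std::ifstream opened on "Heapsort.txt"; afterwards values.size() is the element count. The vector is 0-based, so routines that expect data at indices 1 to n would need adjusting.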

You populate heapList at indices 0 to n-1 only.
Then you access heapList from 1 to n, which goes past the data you read, since no value was put in heapList[n].
Use
for (int i = 0; i < n; i++) //instead of i=1 to n
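One thing worth noting is that heapify, buildHeap, and sort in the question treat the heap as occupying indices 1 through n, while the read loop fills 0 through n-1. If the sorting routines are kept as they are, another option is to read into 1-based slots instead of changing the print loop. The sketch below assumes that approach; readOneBased is a hypothetical helper name, and the capacity check is an addition that keeps the loop from writing past the array.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, for trying the function without a file

const int MAXSIZE = 100;

// Reads integers into heapList[1..n] and returns n.
// Slot 0 is left unused so the data matches a 1-based heap layout.
int readOneBased(std::istream& in, int heapList[], int capacity) {
    int n = 0;
    int value;
    // Check for room first, so no value is consumed from the stream and then dropped.
    while (n + 1 < capacity && in >> value) {
        heapList[++n] = value;
    }
    return n;
}
```

With this layout, n is both the element count and the index of the last element, so loops of the form i = 1 to n stay within the data that was read.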

Related

execution order for cout in C++

When printing to console, if function execution is sequential it would seem logical the ordered array would be printed after calling insertionSort, however order list does not print until next loop. Any help would be appreciated.
#include <stdio.h>
#include <iostream>
#include <array>
using namespace std;
void insertionSort(int* array, int size) {
for (int i = 1; i < size; i++) {
int key = i - 1;
while (i > 0 && array[key] > array[i] ) {
int tmp = array[i];
array[i] = array[key];
array[key] = tmp;
i -= 1;
key -= 1;
}
}
}
const int ARRAY_MAXSIZE = 5;
int main(void) {
int *array = (int*)calloc(ARRAY_MAXSIZE, sizeof(int));
int input;
cout << "Enter 5 digits\n";
for (int size=0; size < ARRAY_MAXSIZE; size++) {
cout << size << " index ";
cin >> input;
array[size] = input;
insertionSort(array, size);
for (int j=0; j <= size; j++) {
cout << array[j];
}
cout << '\n';
}
}
This is a classic off-by-one error. Your insertionSort expects you to pass the number of elements to sort via the parameter size. But your main loop is always holding a value that is one less than the size immediately after adding an element.
I want to say that bugs like this are easily discovered by stepping through your program's execution with a debugger. If you don't know how to use a debugger, start learning now. It is one of the most important tools used by developers.
Anyway, the quick fix is to change your function call to:
insertionSort(array, size + 1);
However, as Paul McKenzie pointed out in comments, it's a bit crazy to do this every time you add a new element because your function sorts an entire unsorted array. Your array is always nearly sorted except for the last element. You only need to call that function once after your input loop is done:
// Read unsorted data
for (int size = 0; size < ARRAY_MAXSIZE; size++) {
cout << size << " index ";
cin >> input;
array[size] = input;
}
// Sort everything
insertionSort(array, ARRAY_MAXSIZE);
// Output
for (int j = 0; j < ARRAY_MAXSIZE; j++) {
cout << array[j];
}
cout << '\n';
But if you want every insertion to result in a sorted array, you can "slide" each new value into place after inserting it. It's similar to a single iteration of your insertion-sort:
// Sort the last element into the correct position
for (int i = size; i >= 1 && array[i] < array[i - 1]; i--)
{
std::swap(array[i], array[i - 1]);
}
Even better, you don't need to swap all those values. You simply read the value, then shuffle the array contents over to make room, then stick it in the right spot:
// Read next value
cin >> input;
// Shuffle elements to make room for new value
int newPos = size;
while (newPos > 0 && array[newPos - 1] > input) {
array[newPos] = array[newPos - 1];
newPos--;
}
// Add the new value
array[newPos] = input;
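Wrapped into a function, the shuffle-and-insert step might look like the sketch below. The name insertSorted and its signature are assumptions for illustration; the caller is assumed to guarantee that the array has room for one more element.

```cpp
// Inserts value into the ascending-sorted range array[0 .. size-1].
// Assumption: the underlying array has capacity for at least size + 1 elements.
void insertSorted(int* array, int size, int value) {
    int newPos = size;
    // Shift larger elements one slot to the right to open a gap.
    while (newPos > 0 && array[newPos - 1] > value) {
        array[newPos] = array[newPos - 1];
        --newPos;
    }
    array[newPos] = value;  // drop the new value into the gap
}
```

Calling insertSorted(array, size, input) inside the input loop keeps the first size + 1 elements sorted after every read, without a separate sorting pass.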

Issues with checking an array moving both forwards and backwards simultaneously and issue printing values stored in a pointer array

Preface: Currently reteaching myself C++ so please excuse some of my ignorance.
The challenge I was given was to write a program to search through a static array with a function and return the indices of the number you were searching for. This only required 1 function and minimal effort so I decided to make it more "complicated" to practice more of the things I have learned thus far. I succeeded for the most part, but I'm having issues with my if statements within my for loop. I want them to check 2 separate spots within the array passed to it, but it is checking the same indices for both of them. I also cannot seem to get the indices as an output. I can get the correct number of memory locations, but not the correct values. My code is somewhat cluttered and I understand there are more efficient ways to do this. I would love to be shown these ways as well, but I would also like to understand where my error is and how to fix it. Also, I know 5 won't always be present within the array since I'm using a pseudo random number generator.
Thank you in advance.
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
// This is supposed to walk through the array both backwards and forwards checking for the value entered and
// incrementing the count so you know the size of the array you need to create in the next function.
int test(int A[], int size, int number) {
int count = 0;
for (int i = 0; i <= size; i++, size--)
{
if (A[i] == number)
count++;
// Does not walk backwards through the array. Why?
if (A[size] == number)
count++;
}
cout << "Count is: " << count << endl;
return (count);
}
// This is a linear search that creates a pointer array from the previous "count" variable in function test.
// It should store the indices of the value you are searching for in this newly created array.
int * search(int A[], int size, int number, int arr_size){
int *p = new int[arr_size];
int count =0;
for(int i = 0; i < size; i++){
if(A[i]==number) {
p[count] = i;
}
count++;
}
return p;
}
int main(){
// Initializing the array to zero just to be safe
int arr[99]={0},x;
srand(time(0));
// Populating the array with random numbers in between 1-100
for (int i = 0; i < 100; i++)
arr[i]= (rand()%100 + 1);
// Was using this to check if the variable was actually in the array.
// for(int x : arr)
// cout << x << " ";
// Selecting the number you wish to search for.
// cout << "Enter the number you wish to search for between 1 and 100: ";
// cin >> x;
// Just using 5 as a test case.
x = 5;
// This returns the number of instances it finds the number you're looking for
int count = test(arr, (sizeof(arr)/4), x);
// If your count returns 0 that means the number wasn't found so no need to continue.
if(count == 0){
cout << "Your number was not found " << endl;
return 0;
}
// This should return the address array created in the function "search"
int *index = search(arr, (sizeof(arr)/4), x, count);
// This should increment through the array which address you assigned to index.
for(int i=0; i < count; i++) {
// I can get the correct number of addresses based on count, just not the indices themselves.
cout << &index[i] << " " << endl;
}
return 0;
}
I deeply appreciate your help and patience as well as I want to thank you again for your help.
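A hedged sketch of one way the two-ended walk could be written. In the posted test function, size starts one past the last valid index, so A[size] reads outside the array on the first iteration; in search, count is incremented outside the if, so indices land in the wrong slots; and the output loop prints &index[i] (an address) rather than index[i]. The sketch below sidesteps the counting pass by collecting indices in a std::vector. The name findIndices is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Walks the array from both ends toward the middle and returns,
// in ascending order, every index whose element equals number.
std::vector<int> findIndices(const int A[], int size, int number) {
    std::vector<int> found;
    for (int lo = 0, hi = size - 1; lo <= hi; ++lo, --hi) {
        if (A[lo] == number) {
            found.push_back(lo);
        }
        // Skip the second check when both ends meet, so the middle
        // element of an odd-sized array is not reported twice.
        if (hi != lo && A[hi] == number) {
            found.push_back(hi);
        }
    }
    std::sort(found.begin(), found.end());  // the two-ended walk finds indices out of order
    return found;
}
```

The result's size() replaces the separate count, and printing found[i] gives the indices themselves.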

Why is an exception being thrown in this dynamic array?

I am having trouble understanding why this exception is being thrown. I allocated an array to receive 100 int values and want to store all odd numbers under 200 into the array (which should be 100 integer values). I'm trying to understand why my code is not working.
I have called my function to allocate an array of 100 int values. After, I created a for-loop to iterate through and store integers into the array however I created an if statement to only store odd numbers. What I can't understand is if I put my counter to 200 and use the if statement an exception is thrown, but if I don't insert the if statement and only put my counter to 100 all numbers between 1-100 stored and an exception won't be thrown.
The only thing I can think of that's causing this is when my counter is at 200 and I have the if statement to catch all odd number, somehow all numbers under 200 are being stored in the array causing the exception to be thrown.
int *allocIntArray(int);
int main() {
int *a;
a = allocIntArray(100);
for (int count = 1; count < 200; count++) {
if (a[count] % 2 == 1) {
a[count] = count;
cout << a[count] << endl;
}
}
delete[] a;
return 0;
}
int *allocIntArray(int size) {
int *newarray = new int[size]();
return newarray;
}
When I look at the program output, it only displays the odd numbers yet the exception is being thrown. That tells me my if statement is working yet something is being muddied up.
What am I missing?
Thanks for your time and knowledge.
Cause of the error
If you have an array a that was created with n elements, accessing an element out of bounds is undefined behavior. So the index MUST always be between 0 and n-1.
So the behavior of your program is undefined as soon as count is 100, since evaluating the condition in the if-clause already tries to access out of bounds.
Adjustment that does what you want
Now in addition, there is a serious bug in your program logic: If you want to add numbers that satisfy some kind of condition, you need 2 counters: one for iterating on the numbers, and one for the last index used in the array:
for (int nextitem=0, count = 1; count < 200; count++) {
if (count % 2 == 1) { // not a[count], you need to test number itself
a[nextitem++] = count;
cout << count << endl;
if (nextitem == 100) { // attention: hard numbers should be avoided
cout << "Array full: " << nextitem << " items reached at " << count <<endl;
break; // exit the for loop
}
}
}
But, this solution requires you to keep track of the last item in the array, and the size of the array (it's hard-coded here).
Vectors
You are probably still learning, but in C++ a better solution would be to use std::vector instead of an array, and add elements with push_back(). Vectors manage the memory for you, so you can focus on your algorithm. The loop would then look like:
vector<int> a;
for (int count = 1; count < 200; count++) {
if (count % 2 == 1) {
a.push_back(count);
cout << count << endl;
}
}
cout << "Added " << a.size() << " elements" <<endl;
cout << "10th element: "<< a[9] << endl;
The problem is not how many numbers you're storing but where you're storing them; you're storing 101 in a[101], which is obviously wrong.
If the i-th odd number is C, the correct index is i-1, not C.
The most readable change is probably to introduce a new counter variable.
int main() {
int a[100] = {0};
int count = 0;
for (int number = 1; number < 200; number++) {
if (number % 2 == 1) {
a[count] = number;
count += 1;
}
}
}
I think transforming this from a search problem to a generation problem makes it easier to get right.
If you happen to remember that every odd number C can be written in the form 2 * A + 1 for some A, you will see that the sequence you're looking for is
2*0+1, 2*1+1, 2*2+1, ..., 2*99+1
so
int main()
{
int numbers[100] = {0};
for (int i = 0; i < 100; i++)
{
numbers[i] = 2 * i + 1;
}
}
You can also go the other way around, looping over the odd numbers and storing them in the right place:
int main()
{
int numbers[100] = {0};
for (int i = 1; i < 200; i += 2) // This loops over the odd numbers.
{
numbers[i/2] = i; // Integer division makes this work.
}
}

Incorrect output for the second smallest integer

I'm having trouble finding the second smallest integer in my array. The array is unsorted (it's what's in the data.txt file), so I know that might be part of the problem, I'm not sure how to fix this in the simplest way. Afterwards I have to remove that integer from the array, move every number over and reprint the array, if anyone could help I'd really appreciate it.
const NUM = 10;
int Array[NUM];
ifstream infile;
infile.open("Data.txt");
for (int i = 0; i < NUM; i++)
{
infile >> Array[i];
cout << Array[i] << endl;
}
int Min = Array[0];
int Next = 0, SecondMin = 0;
for (int k = 0; k < NUM; k++)
{
if (Min > Array[k])
Min = Array[k];
}
for (int m = 2; m < NUM; m++)
{
Next = Array[m];
if (Next > Min)
{
SecondMin = Min;
Min = Next;
}
else if (Next < SecondMin)
{
SecondMin = Next;
}
}
cout << "The second smallest integer is: " << SecondMin << endl;
You don't have to loop over the array twice to find the second smallest number. As long as you're keeping track of both the smallest and the second smallest, you should be able to find them both with a single loop.
There are a couple of other problems with this code:
Your check for end of file should probably be if (!infile.eof())
You don't need to check if (i < NUM) inside your loop. i will always be less than NUM due to the constraint on the loop.
If for some reason the number of items in the file is less than NUM, the rest of the items in the array will have undefined values. For instance, if there were only nine items in the file, after reading the file, Array[9] would have whatever value happened to be in that spot in memory when the array was created. This could cause problems with your algorithm.
I assume that this is some sort of homework problem, which is why the use of an array is required. But keep in mind for the future that you'd probably want to use a std::vector in this sort of situation. That way you could just keep reading numbers from the file and adding them to the vector until you reached the end, rather than having a fixed number of inputs, and all of the values in the vector would be valid.
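The single-loop idea mentioned above can be sketched as follows. This is one possible reading, not the only one: it assumes that a repeated minimum counts as the second smallest (so {3, 1, 1} yields 1), and it returns an index rather than a value so the element can be removed afterwards. The name secondSmallestIndex is hypothetical.

```cpp
#include <vector>

// Returns the index of the second smallest element in one pass,
// or -1 if the vector has fewer than two elements.
// Assumption: duplicates of the minimum count as the second smallest.
int secondSmallestIndex(const std::vector<int>& v) {
    if (v.size() < 2) {
        return -1;
    }
    int minIdx = 0;
    int secondIdx = -1;
    for (int i = 1; i < static_cast<int>(v.size()); ++i) {
        if (v[i] < v[minIdx]) {
            secondIdx = minIdx;  // the old minimum becomes the runner-up
            minIdx = i;
        } else if (secondIdx == -1 || v[i] < v[secondIdx]) {
            secondIdx = i;
        }
    }
    return secondIdx;
}
```

With a std::vector, removing the element and shifting the rest over is then a single call: v.erase(v.begin() + idx). With a plain array, the same effect needs a loop that copies each later element one slot to the left.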

Radix Sort implemented in C++

I am trying to improve my C++ by creating a program that will take a large amount of numbers between 1 and 10^6. The buckets that will store the numbers in each pass is an array of nodes (where node is a struct I created containing a value and a next node attribute).
After sorting the numbers into buckets according to the least significant value, I have the end of one bucket point to the beginning of another bucket (so that I can quickly get the numbers being stored without disrupting the order). My code has no errors (either compile or runtime), but I've hit a wall regarding how I am going to solve the remaining 6 iterations (since I know the range of numbers).
The problem that I'm having is that initially the numbers were supplied to the radixSort function in the form of a int array. After the first iteration of the sorting, the numbers are now stored in the array of structs. Is there any way that I could rework my code so that I have just one for loop for the 7 iterations, or will I need one for loop that will run once, and another loop below it that will run 6 times before returning the completely sorted list?
#include <iostream>
#include <math.h>
using namespace std;
struct node
{
int value;
node *next;
};
//The 10 buckets to store the intermediary results of every sort
node *bucket[10];
//This serves as the array of pointers to the front of every linked list
node *ptr[10];
//This serves as the array of pointer to the end of every linked list
node *end[10];
node *linkedpointer;
node *item;
node *temp;
void append(int value, int n)
{
node *temp;
item=new node;
item->value=value;
item->next=NULL;
end[n]=item;
if(bucket[n]->next==NULL)
{
cout << "Bucket " << n << " is empty" <<endl;
bucket[n]->next=item;
ptr[n]=item;
}
else
{
cout << "Bucket " << n << " is not empty" <<endl;
temp=bucket[n];
while(temp->next!=NULL){
temp=temp->next;
}
temp->next=item;
}
}
bool isBucketEmpty(int n){
if(bucket[n]->next!=NULL)
return false;
else
return true;
}
//print the contents of all buckets in order
void printBucket(){
temp=bucket[0]->next;
int i=0;
while(i<10){
if(temp==NULL){
i++;
temp=bucket[i]->next;
}
else break;
}
linkedpointer=temp;
while(temp!=NULL){
cout << temp->value <<endl;
temp=temp->next;
}
}
void radixSort(int *list, int length){
int i,j,k,l;
int x;
for(i=0;i<10;i++){
bucket[i]=new node;
ptr[i]=new node;
ptr[i]->next=NULL;
end[i]=new node;
}
linkedpointer=new node;
//Perform radix sort
for(i=0;i<1;i++){
for(j=0;j<length;j++){
x=(int)(*(list+j)/pow(10,i))%10;
append(*(list+j),x);
printBucket(x);
}//End of insertion loop
k=0,l=1;
//Linking loop: Link end of one linked list to the front of another
for(j=0;j<9;j++){
if(isBucketEmpty(k))
k++;
if(isBucketEmpty(l) && l!=9)
l++;
if(!isBucketEmpty(k) && !isBucketEmpty(l)){
end[k]->next=ptr[l];
k++;
if(l!=9) l++;
}
}//End of linking for loop
cout << "Print results" <<endl;
printBucket();
for(j=0;j<10;j++)
bucket[i]->next=NULL;
cout << "End of iteration" <<endl;
}//End of radix sort loop
}
int main(){
int testcases,i,input;
cin >> testcases;
int list[testcases];
int *ptr=&list[0];
for(i=0;i<testcases;i++){
cin>>list[i];
}
radixSort(ptr,testcases);
return 0;
}
I think you're severely overcomplicating your solution. You can implement radix using the single array received in the input, with the buckets in each step represented by an array of indices that mark the starting index of each bucket in the input array.
In fact, you could even do it recursively:
// Sort 'size' number of integers starting at 'input' according to the 'digit'th digit
// For the parameter 'digit', 0 denotes the least significant digit and increases as significance does
void radixSort(int* input, int size, int digit)
{
if (size == 0)
return;
int buckets[10]; // assuming decimal numbers
// Sort the array in place while keeping track of bucket starting indices.
// If buckets[i] is meant to be empty (no numbers with i at the specified digit),
// then let buckets[i+1] = buckets[i]
for (int i = 0; i < 10; ++i)
{
radixSort(input + buckets[i], buckets[i+1] - buckets[i], digit+1);
}
}
Of course buckets[i+1] - buckets[i] will read past the end of buckets when i is 9, but I omitted the extra check for readability's sake; I trust you know how to handle that.
With that, you just have to call radixSort(list, testcases, 0), where list is your input array and testcases is its length, and your array should be sorted.
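For reference, here is a hedged sketch of a complete recursive radix sort along these lines. It differs from the skeleton in a few ways that are assumptions of this sketch: it starts at the most significant digit and works downward (each bucket is refined by the next less significant digit, which is what makes the recursive form produce a fully sorted result), it selects the digit with a power-of-ten divisor instead of a digit index, it uses a temporary copy rather than partitioning in place, and it handles non-negative values only. The names msdRadixSort and radixSort are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sorts a[begin, end) by the decimal digit selected by divisor
// (1 = ones, 10 = tens, ...), then recurses on each bucket with the
// next less significant digit. Assumes all values are non-negative.
void msdRadixSort(std::vector<int>& a, std::size_t begin, std::size_t end, int divisor) {
    if (end - begin < 2 || divisor == 0) {
        return;
    }
    // start[d] is the first index of bucket d; start[10] ends up equal to end.
    std::size_t start[11] = {};
    for (std::size_t i = begin; i < end; ++i) {
        ++start[(a[i] / divisor) % 10 + 1];  // count digit d into slot d + 1
    }
    start[0] = begin;
    for (int d = 0; d < 10; ++d) {
        start[d + 1] += start[d];  // prefix sums turn counts into offsets
    }
    // Stable scatter from a temporary copy back into the original range.
    std::vector<int> tmp(a.begin() + begin, a.begin() + end);
    std::size_t next[10];
    std::copy(start, start + 10, next);
    for (int x : tmp) {
        a[next[(x / divisor) % 10]++] = x;
    }
    for (int d = 0; d < 10; ++d) {
        msdRadixSort(a, start[d], start[d + 1], divisor / 10);
    }
}

// Picks the divisor for the most significant digit of the largest value.
void radixSort(std::vector<int>& a) {
    if (a.empty()) {
        return;
    }
    int maxValue = *std::max_element(a.begin(), a.end());
    int divisor = 1;
    while (maxValue / divisor >= 10) {
        divisor *= 10;
    }
    msdRadixSort(a, 0, a.size(), divisor);
}
```

The start array has eleven entries, so the size of bucket 9 can be computed as start[10] - start[9] without a special case.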
To speed up the process with better memory management, create a matrix for the counts that get converted into indices by making a single pass over the array. Allocate a second temp array the same size as the original array, and radix sort between the two arrays until the array is sorted. If an odd number of radix sort passes is performed, then the temp array will need to be copied back to the original array at the end.
To further speed up the process, use base 256 instead of base 10 for the radix sort. This only takes 1 scan pass to create the matrix and 4 radix sort passes to do the sort. Example code:
#include <cstddef> // size_t
#include <cstdint> // uint32_t
#include <utility> // std::swap
uint32_t * RadixSort(uint32_t * a, size_t count)
{
size_t mIndex[4][256] = {0}; // count / index matrix
uint32_t * b = new uint32_t [count]; // allocate temp array
size_t i,j,m,n;
uint32_t u;
for(i = 0; i < count; i++){ // generate histograms
u = a[i];
for(j = 0; j < 4; j++){
mIndex[j][(size_t)(u & 0xff)]++;
u >>= 8;
}
}
for(j = 0; j < 4; j++){ // convert to indices
m = 0;
for(i = 0; i < 256; i++){
n = mIndex[j][i];
mIndex[j][i] = m;
m += n;
}
}
for(j = 0; j < 4; j++){ // radix sort
for(i = 0; i < count; i++){ // sort by current lsb
u = a[i];
m = (size_t)(u>>(j<<3))&0xff;
b[mIndex[j][m]++] = u;
}
std::swap(a, b); // swap ptrs
}
delete[] b;
return(a);
}
Since your values are ints in the range 0 ... 1,000,000, you can create an int array of size 1,000,001 and do the whole thing in two passes.
Initialize the second array to all zeros.
Make a pass through your input array, and use each value as a subscript to increment the corresponding entry in the second array.
Once you have done that, the second pass is easy: walk through the second array, where each element tells you how many times that number appeared in the original array, and use that information to repopulate your input array.
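The two passes described above can be sketched like this. The function name countingSort and the maxValue parameter are assumptions for illustration; the sketch assumes every value lies in the range 0 to maxValue and uses a std::vector for the counts instead of a raw array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts values in place by counting occurrences.
// Assumption: every element satisfies 0 <= v <= maxValue.
void countingSort(std::vector<int>& values, int maxValue) {
    // One counter per possible value, all starting at zero.
    std::vector<std::size_t> counts(static_cast<std::size_t>(maxValue) + 1, 0);

    // Pass 1: use each value as a subscript into the count array.
    for (int v : values) {
        assert(v >= 0 && v <= maxValue);
        ++counts[v];
    }

    // Pass 2: walk the counts in order and write each value back
    // as many times as it appeared.
    std::size_t out = 0;
    for (int v = 0; v <= maxValue; ++v) {
        for (std::size_t c = 0; c < counts[v]; ++c) {
            values[out++] = v;
        }
    }
}
```

For the range in the question this would be called as countingSort(values, 1000000). The count array costs memory proportional to the value range, so the approach suits a small, known range like this one better than arbitrary ints.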