Why is linear search so much faster than binary search? - c++

Consider the following code to find a peak in an array.
#include<iostream>
#include<chrono>
#include<unistd.h>
using namespace std;
//Linear search solution
int peak(int *A, int len)
{
if(A[0] >= A[1])
return 0;
if(A[len-1] >= A[len-2])
return len-1;
for(int i=1; i < len-1; i=i+1) {
if(A[i] >= A[i-1] && A[i] >= A[i+1])
return i;
}
return -1;
}
int mean(int l, int r) {
return l-1 + (r-l)/2;
}
//Recursive binary search solution
int peak_rec(int *A, int l, int r)
{
// cout << "Called with: " << l << ", " << r << endl;
if(r == l)
return l;
if(r == l+ 1)
return (A[l] >= A[l+1])?l:l+1;
int m = mean(l, r);
if(A[m] >= A[m-1] && A[m] >= A[m+1])
return m;
if(A[m-1] >= A[m])
return peak_rec(A, l, m-1);
else
return peak_rec(A, m+1, r);
}
int main(int argc, char * argv[]) {
int size = 100000000;
int *A = new int[size];
for(int l=0; l < size; l++)
A[l] = l;
chrono::steady_clock::time_point start = chrono::steady_clock::now();
int p = -1;
for(int k=0; k <= size; k ++)
// p = peak(A, size);
p = peak_rec(A, 0, size-1);
chrono::steady_clock::time_point end = chrono::steady_clock::now();
chrono::duration<double> time_span = chrono::duration_cast<chrono::duration<double>>(end - start);
cout << "Peak finding: " << p << ", time in secs: " << time_span.count() << endl;
delete[] A;
return 0;
}
If I compile with -O3 and use the linear search solution (the peak function) it takes:
0.049 seconds
If I use the binary search solution which should be much faster (the peak_rec function), it takes:
5.27 seconds
I tried turning off optimization but this didn't change the situation. I also tried both gcc and clang.
What is going on?

What is going on is that you've tested only one case: a strictly increasing array. Your linear search routine has a shortcut that checks the final two entries, so it never even does a linear search. You should test random arrays to get a true sense of the distribution of runtimes.
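A minimal sketch of such a test follows. It assumes a peak means any index whose value is >= each neighbour that exists (the definition the question's code uses). The names random_array, is_peak, peak_linear and peak_binary are made up for this illustration and are not from the question.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Fill a vector with pseudo-random values; a fixed seed keeps runs repeatable.
std::vector<int> random_array(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> value(0, 1000000);
    std::vector<int> a(n);
    for (int& x : a) x = value(gen);
    return a;
}

// True if a[i] is >= each neighbour that exists.
bool is_peak(const std::vector<int>& a, std::size_t i) {
    bool left_ok = (i == 0) || (a[i] >= a[i - 1]);
    bool right_ok = (i + 1 == a.size()) || (a[i] >= a[i + 1]);
    return left_ok && right_ok;
}

// Linear scan with no end-of-array shortcut. Requires a non-empty vector.
std::size_t peak_linear(const std::vector<int>& a) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i)
        if (is_peak(a, i)) return i;
    // No earlier peak means the array is strictly increasing,
    // so the last index is a peak.
    return a.size() - 1;
}

// Binary search for a peak. Requires a non-empty vector.
std::size_t peak_binary(const std::vector<int>& a) {
    std::size_t lo = 0, hi = a.size() - 1;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;   // mid < hi, so a[mid + 1] is in range
        if (a[mid] < a[mid + 1]) lo = mid + 1;  // some peak lies to the right of mid
        else hi = mid;                          // mid or something to its left is a peak
    }
    return lo;
}
```

Timing peak_linear and peak_binary on arrays from random_array, instead of on one sorted array, should give a more representative comparison. The two functions may return different indices, since an array can have many peaks.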

That happens because your linear search has a shortcut for sorted arrays like the one you are passing in. The check if(A[len-1] >= A[len-2]) returns before the search loop is ever entered when the array is sorted in ascending order, so the complexity is constant for such arrays. Your binary search, however, performs a full search of the array and thus takes much longer. The solution is to fill your array with random values. You can do this with a random number generator:
#include <random>
int main() {
std::random_device rd; /* create a random device to seed our Mersenne Twister generator */
std::mt19937 gen(rd()); /* create a generator with a random seed */
std::uniform_int_distribution<> range(0, 100000000); /* specify a range for the random values (choose whatever you want) */
int size = 100000000;
int *A = new int[size];
for(int l=0; l < size; l++)
A[l] = range(gen); /* fill the array with random values in the range 0 - 100000000 */
[ . . . ]
EDIT:
One very important thing when you fill your array randomly: your function will not work with unsorted arrays, since if the first element is greater than the second, or the last one is greater than the one before it, the function returns even if there is a much greater value in between. So remove those lines if you expect unsorted arrays (which you should, since finding a peak element is always constant complexity for sorted arrays and there is no point in searching for one).

Related

binary search array overflow c++

I'm a Computer Science student. This is some code that I completed for my Data Structures and Algorithms class. It compiles fine, and runs correctly, but there is an error in it that I corrected with a band-aid. I'm hoping to get an answer as to how to fix it the right way, so that in the future, I know how to do this right.
The object of the assignment was to create a binary search. I took a program that I had created that used a heap sort and added a binary search. I used Visual Studio for my compiler.
My problem is that I chose to read in my values from a text file into an array. Each integer in the text file is separated by a tabbed space. In line 98, the file reads in correctly, but when I get to the last item in the file, the counter (n) counts one time too many, and assigns a large negative number (because of the array overflow) to that index in the array, which then causes my heap sort to start with a very large negative number that I don't need. I put a band-aid on this by assigning the last spot in the array the first spot in the array. I have compared the number read out to my file, and every number is there, but the large number is gone, so I know it works. This is not a suitable fix for me, even if the program does run correctly. I would like to know if anyone knows of a correct solution that would iterate through my file, assign each integer to a spot in the array, but not overflow the array.
Here is the entire program:
#include "stdafx.h"
#include <iostream>
#include <fstream>
using std::cout;
using std::cin;
using std::endl;
using std::ifstream;
#define MAXSIZE 100
void heapify(int heapList[], int i, int n) //i shows the index of array and n is the counter
{
int listSize;
listSize=n;
int j, temp;//j is a temporary index for array
temp = heapList[i];//temporary storage for an element of the array
j = 2 * i;//end of list
while (j <= listSize)
{
if (j < listSize && heapList[j + 1] > heapList[j])//if the value in the next spot is greater than the value in the current spot
j = j + 1;//moves value if greater than value beneath it
if (temp > heapList[j])//if the value in i in greater than the value in j
break;
else if (temp <= heapList[j])//if the value in i is less than the value in j
{
heapList[j / 2] = heapList[j];//assigns the value in j/2 to the current value in j--creates parent node
j = 2 * j;//recreates end of list
}
}
heapList[j / 2] = temp;//assigns to value in j/2 to i
return;
}
//This method is simply to iterate through the list of elements to heapify each one
void buildHeap(int heapList[], int n) {//n is the counter--total list size
int listSize;
listSize = n;
for (int i = listSize / 2; i >= 1; i--)//for loop to create heap
{
heapify(heapList, i, n);
}
}
//This sort function will take the values that have been made into a heap and arrange them in order so that they are least to greatest
void sort(int heapList[], int n)//heapsort
{
buildHeap(heapList, n);
for (int i = n; i >= 2; i--)//for loop to sort heap--i is >= 2 because the last two nodes will not have anything less than them
{
int temp = heapList[i];
heapList[i] = heapList[1];
heapList[1] = temp;
heapify(heapList, 1, i - 1);
}
}
//Binary search
void binarySearch(int heapList[], int first, int last) {//first=the beginning of the list, last=end of the list
int mid = (first + last) / 2;//to find middle for search
int searchKey;//number to search
cout << "Enter a number to search for: ";
cin >> searchKey;
while ((heapList[mid] != searchKey) && (first <= last)) {//while we still have a list to search through
if (searchKey < heapList[mid]) {
last = mid - 1;//shorten list by half
}
else {
first = mid + 1;//shorten list by half
}
mid = (first + last) / 2;//find new middle
}
if (first <= last) {//found number
cout << "Your number is " << mid << "th in line."<< endl;
}
else {//no number in list
cout << "Could not find the number.";
}
}
int main()
{
int j = 0;
int n = 0;//counter
int first = 0;
int key;//to prevent the program from closing
int heapList[MAXSIZE];//initialized heapList to the maximum size, currently 100
ifstream fin;
fin.open("Heapsort.txt");//in the same directory as the program
while (fin >> heapList[n]) {//read in
n++;
}
heapList[n] = heapList[0];
int last = n;
sort(heapList, n);
cout << "Sorted heapList" << endl;
for (int i = 1; i <= n; i++)//for loop for printing sorted heap
{
cout << heapList[i] << endl;
}
binarySearch(heapList, first, last);
cout << "Press Ctrl-N to exit." << endl;
cin >> key;
}
int heapList[MAXSIZE];//initialized heapList to the maximum size, currently 100
This comment is wrong: the heapList array is declared, not initialized. Reading starts at index 0, so once all the data has been read from the file, n equals the number of values read and heapList[n] is an uninitialized cell. Any attempt to read it invokes undefined behavior. You could either initialize the array before using it, decrement n since as an index it is one past the last value read, or, better, use std::vector instead of an array.
You populate heapList at indices 0 to n-1 only.
Then you access heapList from 1 to n, which runs past the data since no value was put in heapList[n].
Use
for (int i = 0; i < n; i++) //instead of i=1 to n

Count number of ways for choosing two numbers in efficient algorithm

I solved this problem, but I got TLE (Time Limit Exceeded) on the online judge.
The output of the program is right, but I think the approach can be made more efficient.
The problem:
Given n integer numbers, count the number of ways in which we can choose two elements such
that their absolute difference is less than 32.
In a more formal way, count the number of pairs (i, j) (1 ≤ i < j ≤ n) such that
|V[i] - V[j]| < 32, where |X| is the absolute value of X.
Input
The first line of input contains one integer T, the number of test cases (1 ≤ T ≤ 128).
Each test case begins with an integer n (1 ≤ n ≤ 10,000).
The next line contains n integers (1 ≤ V[i] ≤ 10,000).
Output
For each test case, print the number of pairs on a single line.
my code in c++ :
int main() {
int T,n,i,j,k,count;
int a[10000];
cin>>T;
for(k=0;k<T;k++)
{ count=0;
cin>>n;
for(i=0;i<n;i++)
{
cin>>a[i];
}
for(i=0;i<n;i++)
{
for(j=i;j<n;j++)
{
if(i!=j)
{
if(abs(a[i]-a[j])<32)
count++;
}
}
}
cout<<count<<endl;
}
return 0;
}
How can I solve it with a more efficient algorithm?
Despite my previous (silly) answer, there is no need to sort the data at all. Instead you should count the frequencies of the numbers.
Then all you need to do is keep track of the number of viable numbers to pair with, while iterating over the possible values. Sorry no c++ but java should be readable as well:
int solve (int[] numbers) {
int[] frequencies = new int[10001];
for (int i : numbers) frequencies[i]++;
int solution = 0;
int inRange = 0;
for (int i = 0; i < frequencies.length; i++) {
if (i > 32) inRange -= frequencies[i - 32];
solution += frequencies[i] * inRange;
solution += frequencies[i] * (frequencies[i] - 1) / 2;
inRange += frequencies[i];
}
return solution;
}
#include <bits/stdc++.h>
using namespace std;
int a[10010];
int N;
int search (int x){
int low = 0;
int high = N;
while (low < high)
{
int mid = (low+high)/2;
if (a[mid] >= x) high = mid;
else low = mid+1;
}
return low;
}
int main() {
cin >> N;
for (int i=0 ; i<N ; i++) cin >> a[i];
sort(a,a+N);
long long ans = 0;
for (int i=0 ; i<N ; i++)
{
int t = search(a[i]+32);
ans += (t -i - 1);
}
cout << ans << endl;
return 0;
}
You can sort the numbers, and then use a sliding window. Starting with the smallest number, populate a std::deque with the numbers so long as they are no larger than the smallest number + 31. Then in an outer loop for each number, update the sliding window and add the new size of the sliding window to the counter. Update of the sliding window can be performed in an inner loop, by first pop_front every number that is smaller than the current number of the outer loop, then push_back every number that is not larger than the current number of the outer loop + 31.
One faster solution would be to first sort the array, then iterate through the sorted array and for each element only visit the elements to the right of it until the difference exceeds 31.
Sorting can probably be done via count sort (since you have 1 ≤ V[i] ≤ 10,000). So you get linear time for the sorting part. It might not be necessary though (maybe quicksort suffices in order to get all the points).
Also, you can do a trick for the inner loop (the "going to the right of the current element" part). Keep in mind that if S[i+k]-S[i]<32, then S[i+k]-S[i+1]<32, where S is the sorted version of V. With this trick the whole algorithm turns linear.
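The sort-then-scan idea with that trick can be sketched as a two-pointer loop. This is an illustration under the assumption that limit is positive; the function name count_close_pairs is made up here, and std::sort is used instead of a counting sort for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Count pairs (i, j), i < j, with |v[i] - v[j]| < limit. Assumes limit > 0.
long long count_close_pairs(std::vector<int> v, int limit) {
    std::sort(v.begin(), v.end());
    long long pairs = 0;
    std::size_t left = 0;  // first index still within range of v[right]
    for (std::size_t right = 0; right < v.size(); ++right) {
        // left never moves backwards: once v[left] is too far from v[right],
        // it is also too far from every later element.
        while (v[right] - v[left] >= limit) ++left;
        // every element in [left, right) forms a valid pair with v[right]
        pairs += static_cast<long long>(right - left);
    }
    return pairs;
}
```

After sorting, each index is visited at most once by left and once by right, so the scan itself is linear; the sort dominates unless a counting sort is used.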
This can be done in a constant number of passes over the data, and its running time does not depend on the value of the "interval" (in your case, 32).
This is done by populating an array where a[i] = a[i-1] + number_of_times_i_appears_in_the_data; informally, a[i] holds the total number of elements that are smaller than or equal to i.
Code (for a single test case):
static const int UPPER_LIMIT = 10001;
static const int K = 32;
int frequencies[UPPER_LIMIT] = {0}; // O(U)
int n;
std::cin >> n;
for (int i = 0; i < n; i++) { // O(n)
int x;
std::cin >> x;
frequencies[x] += 1;
}
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
frequencies[i] += frequencies[i-1];
}
int count = 0;
for (int i = 1; i < UPPER_LIMIT; i++) { // O(U)
int low_idx = std::max(i-K, 0);
int number_of_elements_with_value_i = frequencies[i] - frequencies[i-1];
if (number_of_elements_with_value_i == 0) continue;
int number_of_elements_with_value_K_close_to_i =
(frequencies[i-1] - frequencies[low_idx]);
count += number_of_elements_with_value_i * number_of_elements_with_value_K_close_to_i;
// Finally, add "duplicates" of i, this is basically sum of arithmetic
// progression with d=1, a0=0, n=number_of_elements_with_value_i
count += number_of_elements_with_value_i * (number_of_elements_with_value_i-1) /2;
}
std::cout << count;
You can sort and then use break to end the inner loop as soon as the difference goes out of range.
int main()
{
int t;
cin>>t;
while(t--){
int n,c=0;
cin>>n;
int ar[n];
for(int i=0;i<n;i++)
cin>>ar[i];
sort(ar,ar+n);
for(int i=0;i<n;i++){
for(int j=i+1;j<n;j++){
if(ar[j]-ar[i] < 32)
c++;
else
break;
}
}
cout<<c<<endl;
}
}
Or, you can use a frequency (hash) array indexed by value: mark the occurrences of each element, then for each value x check which values y with |x - y| < 32 are present.
A good approach here is to split the numbers into separate buckets:
constexpr int limit = 10000;
constexpr int diff = 32;
constexpr int bucket_num = (limit/diff)+1;
std::array<std::vector<int>,bucket_num> buckets;
cin>>n;
int number;
for(i=0;i<n;i++)
{
cin >> number;
buckets[number/diff].push_back(number%diff);
}
Obviously the numbers that are in the same bucket are close enough to each other to fit the requirement, so we can just count all the pairs:
int result = std::accumulate(buckets.begin(), buckets.end(), 0,
[](int s, vector<int>& v){ return s + (v.size()*(v.size()-1))/2; });
The numbers that are in non-adjacent buckets cannot form any acceptable pairs, so we can just ignore them.
This leaves the last corner case - adjacent buckets - which can be solved in many ways:
for(int i=0;i<bucket_num-1;i++)
if(buckets[i].size() && buckets[i+1].size())
result += adjacent_buckets(buckets[i], buckets[i+1]);
Personally I like the "occurrence frequency" approach on the one bucket scale, but there may be better options:
int adjacent_buckets(const vector<int>& bucket1, const vector<int>& bucket2)
{
std::array<int,diff> pairs{};
for(int number : bucket1)
{
for(int i=0;i<number;i++)
pairs[i]++;
}
return std::accumulate(bucket2.begin(), bucket2.end(), 0,
[&pairs](int s, int n){ return s + pairs[n]; });
}
This function first builds an array of "numbers from lower bucket that are close enough to i", and then sums the values from that array corresponding to the upper bucket numbers.
In general this approach has O(N) complexity, in the best case it will require pretty much only one pass, and overall should be fast enough.
This solution takes O(N) time to read the N input numbers, plus a pass over the 10,001 possible values whose cost does not depend on N:
#include <iostream>
using namespace std;
void solve()
{
int a[10001] = {0}, N, n, X32 = 0, ret = 0;
cin >> N;
for (int i=0; i<N; ++i)
{
cin >> n;
a[n]++;
}
for (int i=0; i<10001; ++i)
{
if (i >= 32)
X32 -= a[i-32];
if (a[i])
{
ret += a[i] * X32;
ret += a[i] * (a[i]-1)/2;
X32 += a[i];
}
}
cout << ret << endl;
}
int main()
{
int T;
cin >> T;
for (int i=0 ; i<T ; i++)
solve();
}
Solution explanation: a[i] represents how many times i appeared in the input series.
Then you go over the entire array, and X32 keeps track of the number of elements that are within range of i. The only tricky part really is to count properly when some i is repeated multiple times: a[i] * (a[i]-1)/2. That's it.
You should start by sorting the input.
Then if your inner loop detects the distance grows above 32, you can break from it.
Thanks, everyone, for the effort and time you put into solving this problem.
I appreciate all the attempts to solve it.
After testing the answers on the online judge, I found that the correct and most efficient solutions are Stef's answer and AbdullahAhmedAbdelmonem's answer. Pavel's solution is also correct, but it is exactly the same as Stef's solution in a different language (C++).
Stef's code ran in 358 ms on the Codeforces online judge and was accepted.
AbdullahAhmedAbdelmonem's code ran in 421 ms on the Codeforces online judge and was also accepted.
If they add a detailed explanation of their algorithm, the bounty will go to one of them.
You can try your solution by submitting it to the Codeforces online judge at this link, after choosing problem E. Time Limit Exceeded?
I also found a good and more understandable algorithm that uses a frequency array; its complexity is O(n).
In this algorithm you only need to consider a specific range for each element inserted into the array, which is:
begin = element - 31
end = element + 31
and then count the number of pairs in this range for each element inserted into the frequency array:
int main() {
int T,n,i,j,k,b,e,count;
int v[10000];
int freq[10001];
cin>>T;
for(k=0;k<T;k++)
{
count=0;
cin>>n;
for(i=1;i<=10000;i++)
{
freq[i]=0;
}
for(i=0;i<n;i++)
{
cin>>v[i];
}
for(i=0;i<n;i++)
{
count=count+freq[v[i]];
b=v[i]-31;
e=v[i]+31;
if(b<=0)
b=1;
if(e>10000)
e=10000;
for(j=b;j<=e;j++)
{
freq[j]++;
}
}
cout<<count<<endl;
}
return 0;
}
Finally, I think the best approach to this kind of problem is to use a frequency array and count the number of pairs in a specific range, because its time complexity is O(n).

C++ algorithm optimization: find K combination from N elements

I am pretty much a newbie with C++ and am trying to do some HackerRank challenges as a way to work on that.
Right now I am trying to solve the Angry Children problem: https://www.hackerrank.com/challenges/angry-children
Basically, it asks for a program that, given a set of N integers, finds the smallest possible "unfairness" for a K-length subset of that set. Unfairness is defined as the difference between the max and min of a K-length subset.
The way I'm going about it now is to find all K-length subsets and calculate their unfairness, keeping track of the smallest unfairness.
I wrote the following C++ program that seems to solve the problem correctly:
#include <cmath>
#include <cstdio>
#include <iostream>
using namespace std;
int unfairness = -1;
int N, K, minc, maxc, ufair;
int *candies, *subset;
void check() {
ufair = 0;
minc = subset[0];
maxc = subset[0];
for (int i = 0; i < K; i++) {
minc = min(minc,subset[i]);
maxc = max(maxc, subset[i]);
}
ufair = maxc - minc;
if (ufair < unfairness || unfairness == -1) {
unfairness = ufair;
}
}
void process(int subsetSize, int nextIndex) {
if (subsetSize == K) {
check();
} else {
for (int j = nextIndex; j < N; j++) {
subset[subsetSize] = candies[j];
process(subsetSize + 1, j + 1);
}
}
}
int main() {
cin >> N >> K;
candies = new int[N];
subset = new int[K];
for (int i = 0; i < N; i++)
cin >> candies[i];
process(0, 0);
cout << unfairness << endl;
return 0;
}
The problem is that HackerRank requires the program to come up with a solution within 3 seconds and that my program takes longer than that to find the solution for 12/16 of the test cases. For example, one of the test cases has N = 50 and K = 8; the program takes 8 seconds to find the solution on my machine. What can I do to optimize my algorithm? I am not very experienced with C++.
All you have to do is sort all the numbers in ascending order and then take the minimal a[i + K - 1] - a[i] over all i from 0 to N - K inclusive.
This works because in an optimal subset all the numbers are adjacent in the sorted array.
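A minimal sketch of that approach, assuming 1 <= k <= N (the function name min_unfairness is made up for this example):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Smallest (max - min) over all k-element subsets.
// Precondition: 1 <= k <= values.size().
int min_unfairness(std::vector<int> values, std::size_t k) {
    std::sort(values.begin(), values.end());
    // In sorted order, a window of k consecutive elements has its
    // minimum at the left end and its maximum at the right end.
    int best = values[k - 1] - values[0];
    for (std::size_t i = 1; i + k <= values.size(); ++i) {
        best = std::min(best, values[i + k - 1] - values[i]);
    }
    return best;
}
```

For example, with the values {10, 100, 300, 200, 1000, 20, 30} and k = 3, the sorted order is 10, 20, 30, 100, 200, 300, 1000 and the best window is {10, 20, 30}, giving 20. The cost is dominated by the sort, O(N log N), instead of enumerating every subset.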
One suggestion I'd give is to sort the integer list before selecting subsets. This will dramatically reduce the number of subsets you need to examine. In fact, you don't even need to create subsets: simply look at the elements at index i (starting at 0) and i+k-1, and the lowest difference over all such pairs [in valid bounds] is your answer. So now instead of n choose k subsets (factorial runtime, I believe) you just have to look at ~n windows (linear runtime), and sorting (n log n) becomes your performance bottleneck.

How can I create an array with Fibonacci numbers up to a certain integer n?

So for an assignment I've been asked to create a function that will generate an array of fibonacci numbers and the user will then provide an array of random numbers. My function must then check if the array the user has entered contains any fibonacci numbers then the function will output true, otherwise it will output false. I have already been able to create the array of Fib numbers and check it against the array that the user enters however it is limited since my Fib array has a max size of 100.
bool hasFibNum (int arr[], int size){
int fibarray[100];
fibarray[0] = 0;
fibarray[1] = 1;
bool result = false;
for (int i = 2; i < 100; i++)
{
fibarray[i] = fibarray[i-1] + fibarray[i-2];
}
for (int i = 0; i < size; i++)
{
for(int j = 0; j < 100; j++){
if (fibarray[j] == arr[i])
result = true;
}
}
return result;
}
So basically how can I make it so that I don't have to use int fibarray[100] and can instead generate fib numbers up to a certain point. That point being the maximum number in the user's array.
So for example if the user enters the array {4,2,1,8,21}, I need to generate a fibarray up to the number 21 {1,1,2,3,5,8,13,21}. If the user enters the array {1,4,10} I would need to generate a fibarray with {1,1,2,3,5,8,13}
Quite new to programming so any help would be appreciated! Sorry if my code is terrible.
It is possible that I still don't understand your question, but if I do, then I would achieve what you want like this:
bool hasFibNum (int arr[], int size){
if (size == 0) return false;
int maxValue = arr[0];
for (int i = 1; i < size; i++)
{
if (arr[i] > maxValue) maxValue = arr[i];
}
int first = 0;
int second = 1;
while (first <= maxValue) // <= so that maxValue itself is tested; first and second each advance two terms per pass
{
for (int i = 0; i < size; i++)
{
if (arr[i] == first) return true;
if (arr[i] == second) return true;
}
first = first + second;
second = second + first;
}
return false;
}
Here is a function that returns a dynamic array with all of the Fibonacci numbers up to and including max (assuming max > 0)
std::vector<size_t> make_fibs( size_t max ) {
std::vector<size_t> retval = {1,1};
while( retval.back() < max ) {
retval.push_back( retval.back()+*(retval.end()-2) );
}
return retval;
}
I prepopulate it with 2 elements rather than keeping track of the last 2 separately.
Note that under some definitions, 0 and -1 are Fibonacci numbers. If you are using that, start the array off with {-1, 0, 1} (which isn't their order, it is actually -1, 1, 0, 1, but by keeping them in ascending order we can binary_search below). If you do so, change the type to an int not a size_t.
Next, a sketch of an implementation for has_fibs:
template<class T, size_t N>
bool has_fibs( T(&array)[N] ) {
// bring `begin` and `end` into view, one of the good uses of `using`:
using std::begin; using std::end;
// guaranteed array is nonempty, so `max_element` will have a max, so * is safe.
T m = *std::max_element( begin(array), end(array) );
if (m < 0) m = 0; // deal with the possibility the `array` is all negative
// use `auto` to not repeat a type, and `const` because we aren't going to alter it:
const auto fibs = make_fibs(m);
// d-d-d-ouble `std` algorithm:
return std::find_if( begin(array), end(array), [&fibs]( T v )->bool {
return std::binary_search( begin(fibs), end(fibs), v );
}) != end(array);
}
Here I create a template function that takes your (fixed-size) array by reference. This has the advantage that range-based loops will work on it.
Next, I use a std algorithm max_element to find the max element.
Finally, I use two std algorithms, find_if and binary_search, plus a lambda to glue them together, to find any intersections between the two containers.
I'm liberally using C++11 features and lots of abstraction here. If you don't understand a function, I encourage you to rewrite the parts you don't understand rather than copying blindly.
This code has runtime O(n lg lg n), which is probably overkill. (Fibonacci numbers grow exponentially. Building them takes lg n time, searching them takes lg lg n time, and we search them n times.)

Why does quicksort take longer when sorted in descending order vs ascending order

I have code for quicksort and mergesort, and I've placed a global counter variable that gets incremented for every iteration (comparison) that is made. I would assume this would correspond to my rough asymptotic analysis. For merge sort it does, but for quicksort it doesn't. I don't understand why. I'm choosing the last element of the input array as the pivot in every iteration. I know that this is non-optimal, but that doesn't matter for the sake of this discussion. Since I'm choosing the last element, I would expect that both ascending and descending order arrays would result in O(n^2) comparisons. To be more specific, I would expect that the number of comparisons would be n choose 2, because you are adding n + n-1 + n-2 + n-3 + .... + 1 in the worst case. But this does not seem to be happening.
On an input size of 100,000, with the input sorted in descending order I get 705,082,704 iterations counted. I get the same number for an input array sorted in ascending order. But 100,000 choose 2 is around 5 billion. Why the discrepancy?
For merge sort, with an input of 100,000, I get approx 1.6 million iterations, which is seemingly correct.
The following is the code which includes my implementation of quicksort as well as my counting technique, both of which may be off and thus causing this discrepancy. Otherwise it must be that my logic is wrong about how many iterations this should take?
Also, as an aside, it's curious to me that although the number of comparisons is the same for both the ascending and descending input arrays, the ascending version is 2-3 times as fast. What could account for that? Without further ado, here is the code.
int counter = 0;
int compare (const void * a, const void * b)
{
return ( *(int*)a - *(int*)b );
}
int partition(int *array, int low, int high){
int firsthigh = low;
int pivot = high;
for(int i = low; i < high; i++)
{
counter++;
if(array[i] < array[pivot])
{
swap(array[i], array[firsthigh]);
firsthigh++;
}
}
swap(array[pivot],array[firsthigh]);
return firsthigh;
}
void quicksort(int *array, int low, int high){
int p;
if(low < high)
{
p = partition(array, low, high);
quicksort(array, low, p-1);
quicksort(array,p+1,high);
}
}
int main(){
int array[100000];
for(int i = 0; i < 100000; i++)
array[i] = i;
struct timeval start, end;
for(int i = 0; i < 100000; i++)
cout << array[i] << " ";
gettimeofday(&start, NULL);
//mergesort(array, 0, 99999);
quicksort(array, 0, 99999);
gettimeofday(&end, NULL);
long long time = (end.tv_sec * (unsigned int)1e6 + end.tv_usec) -
(start.tv_sec * (unsigned int)1e6 + start.tv_usec);
for(int i = 0; i < 100000; i++)
cout << array[i] << " ";
cout << endl;
cout << endl << endl << time/1000000.0 << endl;
cout << endl << counter;
}
If you want to count the number of iterations of the inner for loop, use long long. n*(n-1)/2 overflows an int for n = 100 000. If you want to count swaps, you should increment your counter whenever a swap is being done.
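To see the overflow concretely, here is a sketch (an illustration, not the asker's exact program) that counts comparisons in a 64-bit variable and shows what a 32-bit counter would hold after wrapping. The names comparisons, partition_last, quicksort_counted, worst_case and wrap32 are invented for this example. For n = 100,000 the formula n*(n-1)/2 gives 4,999,950,000, which does not fit in a 32-bit int; reduced modulo 2^32 it is 704,982,704, the same ballpark as the count reported in the question. Signed overflow is formally undefined behaviour in C++, so the wrapped value is computed with an unsigned type.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

std::uint64_t comparisons = 0;  // 64-bit, so 5 billion fits comfortably

// Same last-element-pivot scheme as in the question.
int partition_last(std::vector<int>& a, int low, int high) {
    int firsthigh = low;
    for (int i = low; i < high; ++i) {
        ++comparisons;
        if (a[i] < a[high]) {
            std::swap(a[i], a[firsthigh]);
            ++firsthigh;
        }
    }
    std::swap(a[high], a[firsthigh]);
    return firsthigh;
}

void quicksort_counted(std::vector<int>& a, int low, int high) {
    if (low < high) {
        int p = partition_last(a, low, high);
        quicksort_counted(a, low, p - 1);
        quicksort_counted(a, p + 1, high);
    }
}

// Comparisons on already sorted input: (n-1) + (n-2) + ... + 1.
std::uint64_t worst_case(std::uint64_t n) { return n * (n - 1) / 2; }

// What remains of a value in a 32-bit counter (reduction modulo 2^32).
std::uint32_t wrap32(std::uint64_t v) { return static_cast<std::uint32_t>(v); }
```

Running quicksort_counted on a small ascending array, say 2,000 elements, and comparing comparisons with worst_case(2000) is a cheap way to confirm the quadratic count before trusting the formula for larger n.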
Two easy optimizations to make to this are:
pick the pivot randomly;
make the partitioning function more efficient by using Hoare partitioning: https://cs.stackexchange.com/questions/11458/quicksort-partitioning-hoare-vs-lomuto
There are others of course, but this should get you a decent algorithm.