Non-standard sorting algorithm for random unique integers - c++

I have an array of at least 2000 random unique integers, each in the range 0 < n < 65000.
I have to sort it and then get the index of a random value in the array. Each of these operations has to be as fast as possible. For searching, binary search seems to serve well.
For sorting I used the standard quicksort (qsort), but I was told that with the given information the standard sorting algorithms will not be the most efficient. So the question is simple: what would be the most efficient way to sort the array, given this information? Totally puzzled by this.

I don't know why the person who told you that would be so perversely cryptic, but indeed qsort is not the most efficient way to sort integers (or generally anything) in C++. Use std::sort instead.
It's possible that you can improve on your implementation's std::sort for the stated special case (2000 distinct random integers in the range 0-65k), but you're unlikely to do a lot better and it almost certainly won't be worth the effort. The things I can think of that might help:
use a quicksort, but with a different pivot selection or a different threshold for switching to insertion sort from what your implementation of sort uses. This is basically tinkering.
use a parallel sort of some kind. 2000 elements is so small that I suspect the time to create additional threads will immediately kill any hope of a performance improvement. But if you're doing a lot of sorts then you can average the cost of creating the threads across all of them, and only worry about the overhead of thread synchronization rather than thread creation.
That said, if you generate and sort the array, then look up just one value in it, and then generate a new array, you would be wasting effort by sorting the whole array each time. You can just run across the array counting the number of values smaller than your target value: this count is the index it would have. Use std::count_if or a short loop.
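For instance, a minimal sketch of that count-based lookup (the function name and container type are mine, purely for illustration):
#include <algorithm>
#include <cstddef>
#include <vector>
// For a target value known to be in the (unsorted) array, its index in the
// sorted array equals the number of elements smaller than it.
std::size_t sorted_index_of(const std::vector<int>& values, int target)
{
    return std::count_if(values.begin(), values.end(),
                         [target](int v) { return v < target; });
}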
Each of these operations has to be as fast as possible.
That is not a legitimate software engineering criterion. Almost anything can be made a minuscule bit faster with enough months or years of engineering effort; nothing complex is ever "as fast as possible", and even if it were, you wouldn't be able to prove that it cannot be faster, and even if you could, there would be new hardware out there somewhere, or soon to be invented, for which the fastest solution is different and better. Unless you intend to spend your whole life on this task and ultimately fail, get a more realistic goal ;-)

For sorting uniformly distributed random integers, radix sort is typically the fastest algorithm; it can be faster than quicksort by a factor of 2 or more. However, it may be hard to find an optimized implementation of it, while quicksort is far more ubiquitous. Quicksort also has very bad worst-case performance, O(N^2), so if worst-case performance is important you have to look elsewhere; one option is introsort, which is what std::sort in C++ typically uses.
For array lookup a hash table is by far the fastest method. If you don't want yet another data structure, you can always fall back on binary search. If you have uniformly distributed numbers, interpolation search is probably the most effective method (best average performance).
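For illustration, here is a sketch of interpolation search on a sorted array of roughly uniformly distributed integers (the probe formula varies between presentations; this version returns -1 when the value is absent):
#include <vector>
// Interpolation search: probe where the key would lie if the values were
// spread perfectly evenly between a[lo] and a[hi].
int interpolation_search(const std::vector<int>& a, int key)
{
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi && key >= a[lo] && key <= a[hi])
    {
        if (a[hi] == a[lo])                   // avoid division by zero
            return a[lo] == key ? lo : -1;
        int pos = lo + static_cast<int>(
            static_cast<long long>(key - a[lo]) * (hi - lo) / (a[hi] - a[lo]));
        if (a[pos] == key) return pos;
        if (a[pos] < key)  lo = pos + 1;
        else               hi = pos - 1;
    }
    return -1;
}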

Quicksort's complexity is O(n*log(n)), where n = 2000 in your case, and log2(2000) ≈ 10.97.
You can sort in O(n) using one of these algorithms:
Counting sort
Radix sort
Bucket sort
I've compared std::sort() to counting sort for N = 100000000:
#include <iostream>
#include <algorithm>
#include <cstdlib>
#include <time.h>
#include <string.h>
using namespace std;

void countSort(int t[], int o[], int c[], int n, int k)
{
    // Count how many times each value occurs in t[], storing the counts in c[].
    for (int i = 0; i < n; i++)
        c[t[i]]++;
    // Prefix sums: c[i] becomes the number of elements less than or equal to i.
    for (int i = 1; i <= k; i++)
        c[i] += c[i - 1];
    // Place each element of t[] into its correct sorted position in the output o[].
    for (int i = n - 1; i >= 0; i--)
    {
        o[c[t[i]] - 1] = t[i];
        --c[t[i]];
    }
}

void init(int t[], int n, int max)
{
    for (int i = 0; i < n; i++)
        t[i] = rand() % max;
}

double getSeconds(clock_t start)
{
    return (double) (clock() - start) / CLOCKS_PER_SEC;
}

void print(int t[], int n)
{
    for (int i = 0; i < n; i++)
        cout << t[i] << " ";
    cout << endl;
}

int main()
{
    const int N = 100000000;
    const int MAX = 65000;
    int *t = new int[N];
    init(t, N, MAX);
    //print(t, N);

    clock_t start = clock();
    sort(t, t + N);
    cout << "std::sort " << getSeconds(start) << endl;
    //print(t, N);

    init(t, N, MAX);
    //print(t, N);

    // o[] holds the sorted output.
    int *o = new int[N];
    // c[] holds counters.
    int *c = new int[MAX + 1];
    // Set counters to zero.
    memset(c, 0, (MAX + 1) * sizeof(*c));

    start = clock();
    countSort(t, o, c, N, MAX);
    cout << "countSort " << getSeconds(start) << endl;
    //print(o, N);

    delete[] t;
    delete[] o;
    delete[] c;
    return 0;
}
Results (in seconds):
std::sort 28.6
countSort 10.97
For N = 2000 both algorithms report 0 time at this clock resolution.

Standard sorting algorithms, like standard nearly anything, are very good general-purpose solutions. If you know nothing about your data, if it truly consists of "random unique integers", then you might as well go with one of the standard implementations.
On the other hand, most programming problems appear in a context that tells you something about the data, and that additional information usually leads to more efficient problem-specific solutions.
For example, does your data appear all at once or in chunks? If it comes piecemeal, you may speed things up by interleaving incremental sorting, such as dual-pivot quicksort, with data acquisition.
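One way to interleave sorting with acquisition, as a rough sketch only (this keeps a sorted prefix and merges each new chunk into it, rather than the dual-pivot quicksort mentioned above):
#include <algorithm>
#include <cstddef>
#include <vector>
// Keep 'sorted' sorted as chunks arrive: sort the new tail, then merge it
// with the already-sorted prefix in place.
void append_chunk_sorted(std::vector<int>& sorted, const std::vector<int>& chunk)
{
    const std::ptrdiff_t old_size = static_cast<std::ptrdiff_t>(sorted.size());
    sorted.insert(sorted.end(), chunk.begin(), chunk.end());
    std::sort(sorted.begin() + old_size, sorted.end());
    std::inplace_merge(sorted.begin(), sorted.begin() + old_size, sorted.end());
}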

Since the domain of your numbers is so small, you can create an array of 65000 entries, set the entry at the index of each number you see to one, and then collect the indices of all entries that are set to one as your sorted array. That is exactly 67000 iterations in total (assuming initialization of the array is free).
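A minimal sketch of that flag-array idea (it relies on the values being unique, as the question states; with duplicates you would need counts instead of flags):
#include <bitset>
#include <vector>

std::vector<int> flag_sort(const std::vector<int>& input)
{
    std::bitset<65000> seen;
    for (int v : input) seen.set(v);          // mark each value we have seen
    std::vector<int> sorted;
    sorted.reserve(input.size());
    for (int v = 0; v < 65000; v++)           // walk the whole domain in order
        if (seen.test(v)) sorted.push_back(v);
    return sorted;
}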
Since the lists contain 2000 entries, O(n*log(n)) will probably be faster. I can think of no other O(n) algorithm for this, so I suppose you are better off with a general purpose algorithm.

Related

Bucket sort or merge sort?

I am doing a C++ assignment where I have to sort data (n = 400), namely student scores from 0-100. I am torn between using bucket sort, which sorts data into buckets, and merge sort, which divides and conquers. Which one should I use and why?
The answer depends on your data. However, merge sort will run in O(n log n) while bucket sort will run in O(n + b), where b is the number of buckets you have. If scores are from zero to (and including) 100, then b is 101. So the question is whether O(n log n) runs faster than O(n + 101), which is an easy question to answer theoretically, since O(n + 101) = O(n) and clearly O(n) is faster than O(n log n). Even if we did the (admittedly silly) exercise of substituting 400 for n, we would get 501 for bucket sort and, with log2(400) = 9 (rounded up), 3600 for merge sort. But that is silly, because big-O notation doesn't work that way. Theoretically, we would just conclude that O(n) is better than O(n log n).
But that is the theoretical answer. In practice, the overhead hidden behind the big-O counts, and then it might not be as simple.
That being said, the overhead in bucket sort is usually smaller than for merge sort. You need to allocate an array for some counts and an array to put the output in, and after that you need to run through the input twice, first for counting and then for sorting. A simple bucket sort could look like this:
#include <iostream>
#include <string>

// Some fake data
struct student
{
    int score;
    std::string name;
};

struct student scores[] = {
    {45, "jack"},
    {12, "jill"},
    {99, "john"},
    {89, "james"}};

void bucket_sort(int n, struct student in[], struct student out[])
{
    int buckets[101]; // range 0-100 with 100 included
    for (int i = 0; i < 101; i++)
    {
        buckets[i] = 0;
    }
    // Count the scores, then turn the counts into offsets for each bucket.
    for (int i = 0; i < n; i++)
    {
        buckets[in[i].score]++;
    }
    int acc = 0;
    for (int i = 0; i < 101; i++)
    {
        int b = buckets[i];
        buckets[i] = acc;
        acc += b;
    }
    // Bucket the scores
    for (int i = 0; i < n; i++)
    {
        out[buckets[in[i].score]++] = in[i];
    }
}

void print_students(int n, struct student students[])
{
    for (int i = 0; i < n; i++)
    {
        std::cout << students[i].score << ' ' << students[i].name << std::endl;
    }
    std::cout << std::endl;
}

int main(void)
{
    constexpr int no_students = sizeof scores / sizeof scores[0];
    print_students(no_students, scores);
    struct student sorted[no_students];
    bucket_sort(no_students, scores, sorted);
    print_students(no_students, sorted);
    return 0;
}
(excuse my C++, it's been more than 10 years since I used the language, so the code might look a bit more C like than it should).
The best way to work out what is faster in practice is, of course, to measure it. Compare std::sort with something like the above, and you should get your answer.
If it weren't for an assignment, though, I wouldn't recommend experimenting. The built-in std::sort can easily handle 400 elements faster than you need, and there is no need to implement new sorting algorithms for something like that. For an exercise, though, it can be fun to do some measuring and experimenting.
Update
Read Thomas Mailund's answer first. He provided a more relevant answer to this specific question. Since the scores are likely to be integers, histogram sort (a variant of bucket sort) should be faster than merge sort!
Bucket sort performs poorly when the data set is not well distributed, since most of the items will fall into a few popular buckets. In your case, it's reasonable to assume that most of the student scores will be more or less around the median score, with only a few outliers. Therefore I would argue that merge sort performs better in this context, since it is not affected by the distribution of the data set.
Additional Consideration
There could be an argument that bucket sort is better if we can adjust the bucket ranges according to the expected distribution of the data set. Sure, if we hit the jackpot and predict the distribution really well, it can significantly speed up the sorting process. However, the downside is that the sorting performance can plummet when our prediction goes wrong, i.e. on an unexpected data set. For example, a test being too easy or too difficult might lead to such an "unexpected data set" in the context of this question. In other words, bucket sort has better best-case time complexity, whereas merge sort has better worst-case time complexity. Which metric to use for comparing algorithms depends on the needs of each application. In practice, the worst-case time complexity is usually found to be more useful, and I think the same can be said for this specific question. It's also a plus that we don't incur the additional cost of calculating/adjusting the bucket ranges if we go for merge sort.
The question is not precise enough: "I have to sort data (n=400) which is student scores from 0-100."
If the grades are integers, bucket sort with 1 bucket per grade, also called histogram sort or counting sort will do the job in linear time as illustrated in Thomas Mailund's answer.
If the grades are decimal, bucket sort will just add complexity and, given the sample size, merge sort will do just fine in O(n log n) time with a classic implementation.
If the goal of the question is for you to implement a sorting algorithm, the above applies, otherwise you should just use std::sort in C++ or qsort in C with an appropriate comparison function.
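For completeness, using std::sort with a comparison function on the student records might look like the sketch below (the struct is borrowed from Thomas Mailund's answer; the helper name is mine):
#include <algorithm>
#include <string>
#include <vector>

struct student
{
    int score;
    std::string name;
};

// Sort by score using a comparison lambda; std::sort does the rest.
void sort_by_score(std::vector<student>& students)
{
    std::sort(students.begin(), students.end(),
              [](const student& a, const student& b) { return a.score < b.score; });
}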

Optimize counting sort?

Given that the input will be N numbers from 0 to N (with duplicates), how can I optimize the code below for both small and big arrays:
void countingsort(int* input, int array_size)
{
    int max_element = array_size; // because no number will be > N
    int *CountArr = new int[max_element + 1]();
    for (int i = 0; i < array_size; i++)
        CountArr[input[i]]++;
    for (int j = 0, outputindex = 0; j <= max_element; j++)
        while (CountArr[j]--)
            input[outputindex++] = j;
    delete[] CountArr;
}
Having a stable sort is not a requirement.
edit: In case it's not clear, I am talking about optimizing the algorithm.
IMHO there's nothing wrong here. I highly recommend this approach when max_element is small and the numbers being sorted are non-sparse (i.e. consecutive, with no gaps) and greater than or equal to zero.
A small tweak: I'd replace new/delete and just declare a fixed-size array, e.g. 256 entries for max_element.
int CountArr[256] = { }; // Declare and initialize with zeroes
As you bend these rules (sparse data, negative numbers) you'll start struggling with this approach. You would need to find a suitable hashing function to remap the numbers into a compact array, and the more complex the hashing becomes, the more the benefit over well-established sorting algorithms diminishes.
In terms of complexity this cannot be beaten. It's O(N) and beats standard O(N log N) sorting by exploiting the extra knowledge that 0 <= x <= N. You cannot go below O(N) because you need at least one sweep through the input array.
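If the range does include negative numbers but its bounds are known, the simplest "remapping" is an offset; a sketch (assuming the bounds lo and hi are known in advance):
#include <cstddef>
#include <vector>
// Counting sort for values in a known range [lo, hi], possibly negative.
// The "hash" is just the offset v - lo into the count array.
void counting_sort_range(std::vector<int>& a, int lo, int hi)
{
    std::vector<int> count(hi - lo + 1, 0);
    for (int v : a) count[v - lo]++;
    std::size_t out = 0;
    for (int v = lo; v <= hi; v++)
        while (count[v - lo]--) a[out++] = v;
}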

How do I find the frequency of a given number within a range of an array?

The problem is:
You are given an array of size N. You are also given q = the number of queries; in each query you will be given l = lower bound, u = upper bound and num = the number whose frequency you have to count in the range l~u.
I've implemented my code in C++ as follows:
#include <iostream>
#include <map>
using namespace std;

map<int,int> m;

void mapnumbers(int arr[], int l, int u)
{
    for(int i=l; i<u; i++)
    {
        int num=arr[i];
        m[num]++;
    }
}

int main()
{
    int n; //Size of array
    cin>>n;
    int arr[n];
    for(int i=0; i<n; i++)
        cin>>arr[i];
    int q; //Number of queries
    cin>>q;
    while(q--)
    {
        int l,u,num; //l=lower range, u=upper range, num=the number of which we will count frequency
        cin>>l>>u>>num;
        mapnumbers(arr,l,u);
        cout<<m[num]<<endl;
    }
    return 0;
}
But my code has a problem: in each query it doesn't empty the map m. That's why, if I query for the same number twice or thrice, it adds the frequency count to the one already stored.
How do I solve this?
Will it be a poor program for a large number of queries, such as 10^5?
What is an efficient solution for this problem?
You can solve the task using SQRT-decomposition of queries. The complexity will be O(m*sqrt(n)), where m is the number of queries and n is the size of the array. First of all, sort all queries by the following criteria: L/sqrt(n) should be increasing, where L is the left bound of the query, and for equal L/sqrt(n) the right bound R should be increasing too. Then calculate the answer for the first query. After that, move the bounds of the current query to the bounds of the next query one step at a time, updating frequencies as you go. For example, if your first query after sorting is [2,7] and the second is [1,10], move the left bound from 2 to 1 and increase the frequency of a[1], then move the right bound from 7 to 10 and increase the frequencies of a[8], a[9] and a[10]; whenever a bound moves inward instead, decrease the frequency of the element that leaves the window. Keep these frequencies in your map. This is a fairly involved technique, but it solves the task with good complexity. You can read more about SQRT-decomposition of queries here: LINK
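The technique described above is commonly known as Mo's algorithm. A rough sketch follows, assuming inclusive query ranges [l, r], array values that fit in [0, 100000], and query ids filled in by the caller (struct and names are mine):
#include <algorithm>
#include <cmath>
#include <vector>

struct Query
{
    int l, r, num, id;   // inclusive range [l, r], value to count, original query position
};

std::vector<int> solve_queries(const std::vector<int>& a, std::vector<Query> qs)
{
    const int block = std::max(1, (int)std::sqrt((double)a.size()));
    // Order queries by block of l, then by r.
    std::sort(qs.begin(), qs.end(), [block](const Query& x, const Query& y) {
        if (x.l / block != y.l / block) return x.l / block < y.l / block;
        return x.r < y.r;
    });

    std::vector<int> freq(100001, 0);         // frequency table for the current window
    std::vector<int> answer(qs.size());
    int curL = 0, curR = -1;                  // current window [curL, curR], initially empty
    for (const Query& q : qs)
    {
        while (curR < q.r) freq[a[++curR]]++; // extend right
        while (curL > q.l) freq[a[--curL]]++; // extend left
        while (curR > q.r) freq[a[curR--]]--; // shrink right
        while (curL < q.l) freq[a[curL++]]--; // shrink left
        answer[q.id] = freq[q.num];
    }
    return answer;
}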
To clear the map, you need to call map::clear():
void mapnumbers(int arr[], int l, int u)
{
    m.clear();
    for (int i = l; i < u; i++)
        m[arr[i]]++;
}
A better approach to the clearing problem is to make m a local variable of the while (q--) loop, or even of the mapnumbers function.
However, in general it is very strange why you need a map at all. You traverse the whole range anyway, and you know the number you need to count, so why not do this:
int mapnumbers(int arr[], int l, int u, int num)
{
    int result = 0;
    for (int i = l; i < u; i++)
    {
        if (arr[i] == num)
            result++;
    }
    return result;
}
This will be faster, even asymptotically faster, as map operations are O(log N): your original solution ran in O(N log N) per query, while this simple loop runs in O(N).
However, for a really big array and many queries (I guess the problem comes from some competitive programming site, doesn't it?), this still will not be enough. I guess there should be some data structure and algorithm that allows O(log N) per query, though I cannot think of one right now.
UPD: I have just realized that the array does not change in your problem. This makes it much simpler, allowing a simple O(log N) per query solution. You just need to sort all the numbers in the input array, remembering their original positions too (and making sure the sort is stable, so that the original positions end up in increasing order); you do this only once. After that, every query can be answered with just two binary searches.
Many algorithms are available for this kind of problem. It looks like a straightforward data structure problem: you can use a segment tree or square root decomposition; check GeeksforGeeks for the algorithms. The reason I am telling you to learn these algorithms is that this kind of problem has such large constraints that your verdict will be TLE (time limit exceeded) if you use your current method.
Many of the answers here are more complicated than necessary. Here is an easy way to find range frequencies: use binary search to get the answer in O(log n) per query.
For that, use an array of vectors to store the indices at which each number appears, and then use lower_bound and upper_bound provided by the C++ STL.
Here is C++ Code:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

#define MAX 1000010
vector<int> v[MAX];

int main(){
    int n, a;
    cin>>n;
    for (int i = 0; i < n; ++i)
    {
        cin>>a;
        v[a].push_back(i);
    }
    int low = 0, high = 0;
    int q; //Number of queries
    cin>>q;
    while(q--)
    {
        int l,u,num; //l=lower range, u=upper range, num=the number of which we will count frequency
        cin>>l>>u>>num;
        low = lower_bound(v[num].begin(), v[num].end(), l) - v[num].begin();
        high = upper_bound(v[num].begin(), v[num].end(), u) - v[num].begin();
        cout<<(high - low)<<endl;
    }
    return 0;
}
Overall Time Complexity: O(Q*log n)

Find all unique triplets in a given array with sum zero in minimum execution time [duplicate]

This question already has an answer here: Finding three elements that sum to K (1 answer). Closed 7 years ago.
I've got all unique triplets from the code below, but I want to reduce its time complexity. It consists of three for loops. So my question is: is it possible to do this with fewer loops and thereby decrease the time complexity?
Thanks in advance. Let me know.
#include <cstdlib>
#include <iostream>
using namespace std;

void Triplet(int[], int, int);

void Triplet(int array[], int n, int sum)
{
    // Fix the first element and find the other two
    for (int i = 0; i < n-2; i++)
    {
        // Fix the second element and find the third
        for (int j = i+1; j < n-1; j++)
        {
            // Fix the third element
            for (int k = j+1; k < n; k++)
                if (array[i] + array[j] + array[k] == sum)
                    cout << "Result :\t" << array[i] << " + " << array[j] << " + " << array[k] << " = " << sum << endl;
        }
    }
}

int main()
{
    int A[] = {-10,-20,30,-5,25,15,-2,12};
    int sum = 0;
    int arr_size = sizeof(A)/sizeof(A[0]);
    cout << "********************O(N^3) Time Complexity*****************************" << endl;
    Triplet(A,arr_size,sum);
    return 0;
}
I'm not a whiz at algorithms, but one way to improve your program is to do a binary search in your third loop for the value that, in conjunction with the two previous values, gives your sum. This requires the data to be sorted beforehand, which obviously has some overhead depending on your sorting algorithm (std::sort has an average time complexity of O(n log n)).
You can also make use of parallel programming and run the program on multiple threads, but this can get very messy.
Aside from those suggestions, it is hard to think of a better way.
You can get a slightly better complexity of O(n^2*logn) easily enough if you first sort the list and then do a binary search for the third value. The sort takes O(n log n), and the triplet search takes O(n^2) to enumerate all possible pairs times O(log n) for the binary search of the third value, for a total of O(n log n + n^2 log n), or simply O(n^2 log n).
There might be some other fancy things with binary search you can do to reduce that, but I can't easily see (at 4:00 am) anything better.
When a triplet sums to zero, the third number is completely determined by the first two. Thus, you're only free to choose two of the numbers in each triple. With n possible numbers this yields at most n^2 triplets.
I suspect, but I'm not sure, that that's the best complexity you can do. It's not clear to me whether the number of sum-to-zero triplets, for a random sequence of signed integers, will necessarily be on the order of n^2. If it's less (not likely, but if), then it might be possible to do better.
Anyway, a simple way to do this with complexity on the order of n^2 is to first scan through the numbers, storing them in a data structure with constant-time lookup (the C++ standard library provides one). Then scan through the array as your posted code does, except vary only the first and second number of the triple. For the third number, look it up in the constant-time lookup structure already established: if it's there, you have a potential new triple; otherwise not.
For each zero-sum triple thus found, put it also in a constant-time lookup structure.
This ensures the uniqueness criterion at no extra complexity.
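A sketch of that n^2 approach with std::unordered_set (it assumes the array values are distinct, as in the question's example, and it normalizes each triple before storing it so that duplicates are reported only once):
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <set>
#include <unordered_set>
#include <vector>

void triplets_sum_zero(const std::vector<int>& a)
{
    std::unordered_set<int> values(a.begin(), a.end());   // constant-time lookup
    std::set<std::vector<int>> seen;                      // uniqueness of reported triples
    for (std::size_t i = 0; i < a.size(); i++)
    {
        for (std::size_t j = i + 1; j < a.size(); j++)
        {
            const int third = -(a[i] + a[j]);
            // With distinct values, 'third' must differ from the two fixed elements.
            if (third != a[i] && third != a[j] && values.count(third))
            {
                std::vector<int> t = {a[i], a[j], third};
                std::sort(t.begin(), t.end());
                if (seen.insert(t).second)
                    std::cout << t[0] << " + " << t[1] << " + " << t[2] << " = 0" << std::endl;
            }
        }
    }
}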
In the worst case there are C(n, 3) triplets with sum zero in an array of size n, and C(n, 3) is in Θ(n³), so it takes Θ(n³) time just to print the triplets. In general, you cannot do better than cubic complexity.

Fast generation of random set, Monte Carlo Simulation

I have a set of ~100 numbers. I wish to perform an MC simulation on this set; the basic idea is that I fully randomize the set, do some comparison/checks on the first ~20 values, store the result and repeat.
Now the actual comparison/check algorithm is extremely fast, it actually completes in about 50 CPU cycles. With this in mind, and in order to optimize these simulations, I need to generate the random sets as fast as possible.
Currently I'm using a Multiply With Carry algorithm by George Marsaglia, which gives me a random integer in 17 CPU cycles, quite fast. However, using the Fisher-Yates shuffling algorithm I have to generate 100 random integers, ~1700 CPU cycles. This overshadows my comparison time by a long way.
So my question is: are there other well-known/robust techniques for doing this type of MC simulation, where I can avoid the long random set generation time?
I thought about just randomly choosing 20 values from the set, but I would then have to do collision checks to ensure that 20 unique entries were chosen.
Update:
Thanks for the responses. I have another question with regard to a method I just came up with after my post. The question is: will this provide robust, truly random output (assuming the RNG is good)? Basically my method is to set up an array of integer values the same length as my input array and set every value to zero. Then I begin randomly choosing 20 values from the input set like so:
int pcfast[100];
memset(pcfast, 0, sizeof(int) * 100);
int nchosen = 0;
while (nchosen < 20)
{
    int k = rand(100); // random index in [0,100)
    if (pcfast[k] == 0)
    {
        pcfast[k] = 1;
        r[nchosen++] = s[k]; // r is the length-20 output, s the input set.
    }
}
Basically it's what I mentioned above, choosing 20 values at random, except it seems like a somewhat optimized way of ensuring no collisions. Will this provide good random output? It's quite fast.
If you only use the first 20 values in the randomised array, then you only need to do 20 steps of the Fisher-Yates algorithm (Knuth's version). Then 20 values have been randomised (actually at the end of the array rather than at the beginning, in the usual formulation), in the sense that the remaining 80 steps of the algorithm are guaranteed not to move them. The other 80 positions aren't fully shuffled, but who cares?
C++ code (iterators should be random-access):
#include <cstddef>   // std::size_t
#include <utility>   // std::swap

using std::swap;

template <typename Iterator, typename Rand> // you didn't specify the type
void partial_shuffle(Iterator first, Iterator middle, Iterator last, Rand rnd) {
    std::size_t n = last - first;
    while (first != middle) {
        std::size_t k = rnd(n); // random integer from 0 to n-1
        swap(*(first + k), *first);
        --n;
        ++first;
    }
}
On return, the values from first through to middle-1 are shuffled. Use it like this:
int arr[100];
for (int i = 0; i < 100; ++i) arr[i] = i;

while (need_more_samples()) {
    partial_shuffle(arr, arr + 20, arr + 100, my_prng);
    process_sample(arr, arr + 20);
}
The Ross simulation book suggests something like the following:
double result[10];              // holds the 10 sampled values
for (int i = 0, n = 100; i < 10; i++) {
    int x = rand(n);            // pseudocode - generate an integer on [0, n-1]
    result[i] = arr[x];
    arr[x] = arr[n - 1];        // replace the chosen value with the last unchosen one
    n--;
}