Radix Sort base 256 Performance - c++

I'm trying to implement radix sort with base 256 using linked lists. The sort works, but it takes too long on big arrays. In addition, the complexity should be linear, O(n), but the timings I print in the output don't show that. Here is my code:
Insert Function:
//insert to the back of the list element pointed to by x
void insert(Item * ls, Item * x)
{
x->prev = ls->prev;
ls->prev->next=x;
x->next=ls;
ls->prev=x;
}
Delete Function:
//delete link in list whose address is x
void delete_x(Item * x)
{
x->prev->next = x->next;
x->next->prev = x->prev;
delete x; // the node was allocated with "new Item", so use delete, not delete[]
}
Radix_Sort Function:
void radix_sort_256(unsigned int *arr,unsigned int length)
//Radix sort implementation with base 256
{
int num_of_digits=0,count=0,radix_num=0;
unsigned int base=0,largest=0;
Item List [256]; //Creating 256 Nodes ( Base 256 )
for(int j=0; j<256;j++) // Sentinel Init for each Node
{
List[j].key=0;
List[j].next=&List[j];
List[j].prev=&List[j];
}
for(unsigned int i=0; i<length ; i++) //Finding the largest number in the array
{
if(arr[i]>largest)
largest = arr[i];
}
while(largest != 0 ) //Finding the total number of base-256 digits in the biggest number ("largest") of the array.
{
num_of_digits++;
largest = largest >> 8;
}
for(int i=0; i<num_of_digits; i++)
{
Item *node;
for(unsigned int j=0; j<length; j++)
{
node = new Item; //Creating a new node for each array element and appending it
node->next = NULL; // to the bucket list selected by its current digit.
node->prev = NULL;
node->key = arr[j];
radix_num = ( arr[j] >> (8*i) ) & 0xFF;
insert(&List[radix_num],node);
}
for(int m=0 ; m<256 ; m++) //checking the list for keys // if key found inserting it to the array in the original order
{
while( List[m].next != &List[m] )
{
arr[count]=List[m].next->key;
delete_x(List[m].next); //deleting the Item after the insertion
count++;
}
}
count=0;
}
}
Main:
int main()
{
Random r;
int start,end;
srand((unsigned)time(NULL));
// Setting up dynamic arrays of growing sizes,
// filling the arrays with random
for(unsigned int i=10000 ; i <= 1280000; i*=2)
{
// numbers from [0 to 2147483646] calling the radix
// sort function and timing the results
unsigned int *arr = new unsigned int [i];
for(int j=0 ; j<i ; j++)
{
arr[j] = r.Next()-1;
}
start = clock();
radix_sort_256(arr,i);
end = clock();
cout<<i;
cout<<" "<<end-start;
if(Sort_check(arr,i))
cout<<"\t\tArray is sorted"<<endl;
else
cout<<"\t\tArray not sorted"<<endl;
delete [] arr;
}
}
Can anyone see whether I'm doing some unnecessary operations that take a great deal of time to execute?

Complexity is a difficult beast to master, because it is polymorphic.
When we speak about the complexity of an algorithm, we generally simplify it and express it in terms of what we think is the bottleneck operation.
For example, when evaluating sorting algorithms, the complexity is usually expressed as the number of comparisons; however, should your memory be a tape [1] instead of RAM, the true bottleneck is memory access, and therefore a quicksort, O(N log N), ends up being slower than a bubble sort, O(N ** 2).
Here, your algorithm may be optimal, but its implementation seems lacking: there is a lot of memory allocation/deallocation going on, for example (one new and one delete per element on every pass). Therefore, it may well be that you did not identify the bottleneck operation correctly, and that all talk of linear complexity is moot since you are not measuring the right thing.
[1] Because a tape takes time proportional to the distance between two cells to move from one to the other, a quicksort that keeps jumping around memory ends up doing a lot of back and forth, whilst a bubble sort just runs the length of the tape N times (at most).

Radix sort with base 256 could easily look something like this.
void sort(int *a, int n)
{
int i, *b, max = 0;
long long exp = 1; /* 64-bit: with int, exp *= 256 overflows once max >= 2^24 */
for (i = 0; i < n; i++) {
if (a[i] > max)
max = a[i];
}
b = (int*)malloc(n * sizeof(int));
while (max / exp > 0) {
int box[256] = {0};
for (i = 0; i < n; i++)
box[a[i] / exp % 256]++;
for (i = 1; i < 256; i++)
box[i] += box[i - 1];
for (i = n - 1; i >= 0; i--)
b[--box[a[i] / exp % 256]] = a[i];
for (i = 0; i < n; i++)
a[i] = b[i];
exp *= 256;
}
free(b);
}
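A variant of the same counting approach, written for unsigned 32-bit keys like the ones in the question, might look like the sketch below. It is only an illustration: the name radix_sort_256 is reused from the question, and the use of std::vector, shifts instead of division, and a fixed four passes are my assumptions, not part of the answer above. The point is that it allocates one buffer in total, with no per-element new/delete.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort, base 256, for unsigned 32-bit keys (sketch).
// One scratch buffer is allocated once; each pass is a stable counting sort
// on one byte of the key.
void radix_sort_256(std::vector<std::uint32_t>& a) {
    std::vector<std::uint32_t> buf(a.size());
    for (int shift = 0; shift < 32; shift += 8) {
        // start[d + 1] first holds the number of keys whose current byte is d.
        std::array<std::size_t, 257> start{};
        for (std::uint32_t v : a)
            ++start[((v >> shift) & 0xFF) + 1];
        // Prefix sums: start[d] becomes the first output slot for byte d.
        for (int d = 0; d < 256; ++d)
            start[d + 1] += start[d];
        // Forward scan keeps equal bytes in their previous order (stable).
        for (std::uint32_t v : a)
            buf[start[(v >> shift) & 0xFF]++] = v;
        a.swap(buf);  // the sorted-by-this-byte data is now back in a
    }
}
```

Calling radix_sort_256(v) on a std::vector<std::uint32_t> sorts it in place in ascending order. Always doing four passes is simpler than finding the largest key first, at the cost of some wasted passes when all keys are small.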

Related

Why is one algorithm faster than another with the same time complexity?

I was working on a codingame challenge : the horse racing dual.
The goal is to find the minimum difference between two elements of a list.
I started with this first algorithm, which I think is O(n log(n)), but the execution timed out for large arrays.
int array[N];
int min = numeric_limits<int>::max();
for (int i = 0; i < N; i++) {
int value;
cin >> value;
cin.ignore();
array[i] = value;
for (int j = i - 1; j >= 0; --j) {
int diff = abs(array[j] - value);
if (diff < min) {
min = diff;
}
}
}
I then tried this other algorithm, which is also O(n log(n)), and this time the execution finishes in time.
int array[N];
int min = numeric_limits<int>::max();
for (int i = 0; i < N; i++) {
int value;
cin >> value;
cin.ignore();
array[i] = value;
}
sort(array, array + N);
for (int i = 1; i < N; ++i) {
int diff = abs(array[i - 1] - array[i]);
if (diff < min) {
min = diff;
}
}
Am I wrong with the first code complexity ? Is there any difference that I did not notice ?
Thanks for your help.
Am I wrong with the first code complexity?
Yes, you are wrong: the complexity is not O(n log n) but O(n^2).
The outer loop runs n (N) times, while the inner loop runs about n/2 times on average. Thus, the complexity is O(n * n/2), which is O(n^2), since multiplicative constants don't matter in complexity calculations.
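To check that count, here is a small sketch that mirrors the loop structure of the first snippet and simply counts how often the inner body runs. The function name inner_iterations is hypothetical and exists only for this illustration.

```cpp
#include <cstdint>

// Counts how many times the inner loop body of the first snippet executes
// for n input values: for each i, j runs over the i earlier elements.
std::uint64_t inner_iterations(std::uint64_t n) {
    std::uint64_t count = 0;
    for (std::uint64_t i = 0; i < n; ++i)
        for (std::uint64_t j = 0; j < i; ++j)
            ++count;
    return count;  // equals n * (n - 1) / 2
}
```

For n = 1000 this gives 499500 iterations, so doubling n roughly quadruples the work, which is the quadratic growth described above.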
Is there any difference that I did not notice?
Yes, there is. Even two algorithms with the very same complexity, such as O(n log n), can have very different running times because of hidden constants, which asymptotic notation ignores.

Maximum sub-arrays that an array can be divided into such that GCD of any two elements in different sub-arrays is always 1?

Given n, the number of array elements and arr[n], the array of numbers, it is required to find the maximum number of sub-arrays the array can be divided into such that GCD(a,b)=1 for every a and b that belong to different sub-arrays.
Eg:
5
2 3 4 5 6
Ans: 2 ----> {(2,3,4,6),(5)}
Every other attempt to divide it further will not satisfy the conditions.
My Approach:
1. Sort the array.
2. Keep calculating the LCM of the elements.
3. Increase the counter every time the gcd of the element and lcm of elements before is 1.
int main()
{
int n;
cin>>n;
long long int arr[n];
for(int i=0;i<n;++i)
cin>>arr[i];
sort(arr,arr+n);
long long int ans=1,l=arr[n-1];
for(int i=n-2;i>=0;i--)
{
if(gcd(l,arr[i])==1)
ans++;
l=lcm(l,arr[i]);
}
cout<<ans<<endl;
return 0;
}
After my answer was judged as a wrong answer multiple times, I am not sure whether my solution is correct. Since the limit for n was 10^6 and for array elements 10^7, another reason the solution could have failed is that the LCM can exceed the long long limit. Is there any other possible solution? Or is there a mistake in the present approach?
I think this is the problem you are referring to: https://www.codechef.com/problems/CHEFGRUP
My approach is as follows (I got Time Limit Exceeded):
Step - 1: Calculate all the primes in the range [1, 10^7].
This can be done using the Sieve of Eratosthenes, and the complexity will be O(n log(log(n))), where n can be up to 10^7.
Step - 2: Use the vector of primes calculated above to find prime factorization of all the numbers in the array.
This can be implemented very efficiently once we have all the required primes.
The point to note in this step is that if two numbers share a prime factor, they cannot be in different subarrays, because then their GCD won't be 1 (as required in the question). Hence, all such pairs have to be in the same subarray. How to achieve this?
Step - 3: Use Disjoint Set Data Structure.
We can create a disjoint set of all the prime numbers, so the number of sets in the beginning is the number of prime numbers. Then, during each factorization, we join all the prime numbers that divide the number and put them in the same group as the original number. This is repeated for all the numbers.
Also, we have to check whether each prime number was actually needed. Before this step we just assumed that there are as many sets as there are prime numbers in the range, but some might be unused. This can be checked by looping once over the array and counting the number of unique representatives. This will be our answer.
My code:
#include <bits/stdc++.h>
using namespace std;
typedef long long int ll;
int prime[(int)1e7+10] = {0};
struct union_find {
std::vector <int> parent, rank;
// Constructor to initialise the 'parent' and 'rank' vectors.
union_find(int n) {
parent = std::vector <int> (n);
rank = std::vector <int> (n, 0); // initialise rank vector with 0.
for(int i = 0; i < n; i++)
parent[i] = i;
}
// Find with Path Compression Heuristic.
int find_(int a) {
if(a == parent[a])
return a;
return parent[a] = find_(parent[a]);
}
// Union by checking rank to keep the depth of the tree as shallow as possible.
void union_(int a, int b) {
int aa = find_(a), bb = find_(b);
if(rank[aa] < rank[bb])
parent[aa] = bb;
else
parent[bb] = aa;
if(rank[aa] == rank[bb])
++rank[aa];
}
};
union_find ds(1e7+10);
int main() {
int n;
int sq = sqrt(1e7+10);
for(int i = 4; i < 1e7+10; i += 2)
prime[i] = 1;
for(int i = 3; i <= sq; i += 2) {
if(!prime[i]) {
for(int j = i*i; j < 1e7+10; j += i)
prime[j] = 1;
}
}
vector <int> primes;
primes.push_back(2);
for(int i = 3; i < 1e7+10; i += 2) {
if(!prime[i])
primes.push_back(i);
}
scanf("%d", &n);
int a[n];
for(int i = 0; i < n; i++) {
scanf("%d", &a[i]);
}
for(int i = 0; i < n; i++) {
int temp = a[i];
// int sq = sqrt(temp);
vector <int> divisors;
for(int j = 0; j < primes.size(); j++) {
if(primes[j] > temp)
break;
if(temp % primes[j] == 0) {
divisors.push_back(primes[j]);
while(temp % primes[j] == 0) {
temp /= primes[j];
}
}
}
if(temp > 2)
divisors.push_back(temp);
for(int i = 1; i < divisors.size(); i++)
ds.union_(divisors[i], divisors[i-1]);
if(divisors.size() > 0)
ds.union_(divisors[0], a[i]);
}
set <int> unique;
for(int i = 0; i < n; i++) {
int x = ds.find_(a[i]);
unique.insert(x);
}
printf("%d\n", (int)unique.size()); // size() returns size_t, so cast it for %d
return 0;
}
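Since the code above timed out, one direction that might help is to replace the trial division over the list of primes with a smallest-prime-factor table, so that each number is factorized in O(log x) steps. This is a different technique from the one in the code above, offered only as a sketch. The names build_spf, find_root and count_groups are hypothetical, the limit is passed as a parameter, and treating every element equal to 1 as its own group is my assumption about the problem.

```cpp
#include <numeric>
#include <vector>

// spf[x] is the smallest prime dividing x (for x >= 2).
std::vector<int> build_spf(int limit) {
    std::vector<int> spf(limit + 1);
    std::iota(spf.begin(), spf.end(), 0);
    for (int i = 2; static_cast<long long>(i) * i <= limit; ++i) {
        if (spf[i] != i) continue;              // i is composite, skip it
        for (int j = i * i; j <= limit; j += i)
            if (spf[j] == j) spf[j] = i;        // first prime to reach j
    }
    return spf;
}

// Disjoint-set find with path halving.
int find_root(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Number of groups such that elements in different groups are coprime.
// Assumes every element is in [1, limit].
int count_groups(const std::vector<int>& a, int limit) {
    std::vector<int> spf = build_spf(limit);
    std::vector<int> parent(limit + 1);
    std::iota(parent.begin(), parent.end(), 0);
    std::vector<char> used(limit + 1, 0);
    int groups = 0;
    for (int x : a) {
        if (x == 1) { ++groups; continue; }     // assumption: a 1 can stand alone
        int first = spf[x];
        while (x > 1) {
            int p = spf[x];
            used[p] = 1;
            // Join every prime factor of this element with its first one.
            parent[find_root(parent, p)] = find_root(parent, first);
            while (x % p == 0) x /= p;
        }
    }
    // Each set of joined primes that actually occurs is one group.
    for (int p = 2; p <= limit; ++p)
        if (used[p] && find_root(parent, p) == p) ++groups;
    return groups;
}
```

For the example from the question, count_groups({2, 3, 4, 5, 6}, 10) returns 2. Whether this is fast enough for the judge is not something I have verified.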

Bubble Sort Using Slides instead of swaps

Currently I'm being asked to design four sorting algorithms (insertion, shell, selection, and bubble), and I have 3 of the 4 working perfectly; the only one that isn't functioning correctly is the bubble sort. I'm well aware of how the normal bubble sort works, using a temp variable to swap two elements. The tricky part is that this version must use array index [0] as the temp instead of a separate variable, slide the smaller values down toward the front of the list, and at the end of each pass assign the temp, which holds the greatest value, to the last index.
I've been playing around with this for a while and even tried to look up references but sadly I cannot find anything. I'm hoping that someone else has done this prior and can offer some helpful tips. This is sort of a last resort as I've been modifying and running through the passes with pen and paper to try and find my fatal error. Anyways, my code is as follows...
void BubbleSort(int TheArray[], int size)
{
for (int i = 1; i < size + 1; i++)
{
TheArray[0] = TheArray[i];
for (int j = i + 1; j < size; j++)
{
if (TheArray[j] > TheArray[0])
TheArray[0] = TheArray[j];
else
{
TheArray[j - 1] = TheArray[j];
}
}
TheArray[size- 1] = TheArray[0];
}
}
Thanks for any feedback whatsoever; it's much appreciated.
If I understand the problem statement, I think you're looking for something along these lines :
void BubbleSort(int theArray[], int size)
{
for (int i = 1; i < size + 1; i++)
{
theArray[0] = theArray[1];
for (int j = 1; j <= size + 1 - i; j++)
{
if (theArray[j] > theArray[0])
{
theArray[j-1] = theArray[0];
theArray[0] = theArray[j];
}
else
{
theArray[j - 1] = theArray[j];
}
}
theArray[size-i+1] = theArray[0];
}
}
The piece that your code was missing, I think, is that once you find a new maximum, you have to put the old one back in the array before placing the new maximum in the theArray[0] storage location (see theArray[j-1] = theArray[0] after the compare). Additionally, the inner loop should run one element less on each pass, since the last element will already be the current max value and you don't want to revisit those array elements. (See for(int j = 1 ; j <= size + 1 - i ; j++))
For completeness, here's the main driver I used to (lightly) test this :
int main()
{
int theArray[] = { 0, 5, 7, 3, 2, 8, 4, 6 };
int size = 7;
BubbleSort(theArray, size);
for (int i = 1; i < size + 1; i++)
cout << theArray[i] << endl;
return 0;
}

Count triplets which satisfy given condition [duplicate]

Given an array A of size N I need to count such triplets (i,j,k) such that:
Condition 1 : i < j < k
Condition 2 : A[i] > A[j] > A[k]
I know an O(N^3) solution. Can there be something like an O(N) or O(N log N) solution to this problem, as N can be up to 100000?
Example: Let N=4 and the array be [4,3,2,1]; then the answer is 4, as {4,3,2}, {4,3,1}, {4,2,1} and {3,2,1} are all the possible triplets.
How to find this count for given N and array A?
My Approach :
int n;
cin>>n;
vector<int> A(n);
for(int i=0;i<n;i++){
cin>>A[i];
}
int count=0;
for(int i=0;i<n;i++){
for(int j=i+1;j<n;j++){
for(int k=j+1;k<n;k++){
if(A[i]>A[j] && A[j]>A[k]){
count++;
}
}
}
}
cout<<count<<"\n";
First, sort the array while keeping the original index of each element.
class Node{
int index, val;
}
To compare two nodes, we first compare their values, sorting in descending order. If the values are equal, we compare their indexes, so that among equal values the node with the greater index comes first (this matches the sorted example and the comparator in the Java code).
Now we process the nodes in sorted order and add each node's index to a Fenwick tree. Before adding index i, we query the tree for the number of previously added indexes that are smaller than i. This is the number of elements that have a greater value than the current element and appear before it.
For elements with equal values, the sort order described above adds those with a greater index to the tree first, so they do not affect the count returned by the query.
Apply a similar step, processing the nodes in reverse order, to obtain the number of elements that are smaller than the current one and have a greater index.
For example:
If we have an array
{0(1) ,1(2) , 2(2) ,3(4) , 4(4) ,5(4) ,6(1)} //index(value)
After sort -> {5(4), 4(4), 3(4), 2(2), 1(2), 6(1), 0(1) }
Pseudo code
Node[]data;
sort(data)
Fenwick tree;
int[]less;
int[]more;
for(int i = 0; i < data.length; i++){
less[data[i].index] = tree.query(data[i].index);
tree.add(data[i].index, 1);
}
tree.clear();
for(int i = data.length - 1; i >= 0; i--){
more[data[i].index] = tree.query(data.length) -tree.query(data[i].index);
tree.add(data[i].index, 1);
}
int result = 0;
for(int i = 0; i < data.length; i++)
result += more[i]*less[i];
Time complexity will be O(n logn).
Working Java code (FT is my Fenwick tree)
PrintWriter out;
Scanner in = new Scanner(System.in);
out = new PrintWriter(System.out);
int n = in.nextInt();
Node[] data = new Node[n];
for (int i = 0; i < n; i++) {
data[i] = new Node(i + 1, in.nextInt());
}
FT tree = new FT(n + 2);
Arrays.sort(data, new Comparator<Node>() {
@Override
public int compare(Node o1, Node o2) {
if (o1.val != o2.val) {
return o2.val - o1.val;
}
return o2.index - o1.index;
}
});
int[] less = new int[n];//Store all nodes with greater index and smaller value;
int[] greater = new int[n];//Store all nodes with smaller index and greater value
for (int i = 0; i < n; i++) {
greater[data[i].index - 1] = (int) tree.get(data[i].index);
tree.update(data[i].index, 1);
}
tree = new FT(n + 2);
for (int i = n - 1; i >= 0; i--) {
less[data[i].index - 1] = (int) (tree.get(n) - tree.get(data[i].index));
tree.update(data[i].index, 1);
}
long total = 0;
for (int i = 0; i < n; i++) {
total += (long) less[i] * greater[i]; // cast first: the int product can overflow for large n
}
out.println(total);
out.close();
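The FT class is not shown in the answer above, so here is a self-contained C++ sketch of the same counting idea. It is a variant rather than a translation: it compresses the values to ranks and sweeps by index, instead of sorting by value and storing indexes in the tree. The names Fenwick and count_decreasing_triplets are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Minimal Fenwick (binary indexed) tree over positions 1..n.
struct Fenwick {
    std::vector<long long> t;
    explicit Fenwick(int n) : t(n + 1, 0) {}
    void add(int i, long long v) {
        for (; i < static_cast<int>(t.size()); i += i & -i) t[i] += v;
    }
    long long sum(int i) const {  // sum of positions 1..i
        long long s = 0;
        for (; i > 0; i -= i & -i) s += t[i];
        return s;
    }
};

// Counts triplets i < j < k with a[i] > a[j] > a[k] in O(n log n).
long long count_decreasing_triplets(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    // Coordinate compression: map each value to its rank in 1..m.
    std::vector<int> sorted(a);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
    const int m = static_cast<int>(sorted.size());
    std::vector<int> rank(n);
    for (int i = 0; i < n; ++i)
        rank[i] = static_cast<int>(
            std::lower_bound(sorted.begin(), sorted.end(), a[i]) - sorted.begin()) + 1;

    // greater_left[j]: elements before j that are strictly greater than a[j].
    std::vector<long long> greater_left(n), smaller_right(n);
    Fenwick left(m);
    for (int j = 0; j < n; ++j) {
        greater_left[j] = j - left.sum(rank[j]);
        left.add(rank[j], 1);
    }
    // smaller_right[j]: elements after j that are strictly smaller than a[j].
    Fenwick right(m);
    for (int j = n - 1; j >= 0; --j) {
        smaller_right[j] = right.sum(rank[j] - 1);
        right.add(rank[j], 1);
    }
    // Each j is the middle of greater_left[j] * smaller_right[j] triplets.
    long long total = 0;
    for (int j = 0; j < n; ++j) total += greater_left[j] * smaller_right[j];
    return total;
}
```

For the array [4,3,2,1] from the question this returns 4. The counts are kept in long long because the total can exceed the int range when N is close to 100000.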
You can do this in O(n*n) pretty easily; you just need to keep track of how many smaller numbers each element has after it:
vector<int> smallerNumbers(A.size());
for (int i = A.size() - 2; i >= 0; --i){
for (int j = i + 1; j < A.size(); ++j){
if (A[i] > A[j]){
smallerNumbers[i]++;
count += smallerNumbers[j];
}
}
}
For an O(nklogn) solution see my answer here: https://stackoverflow.com/a/28379003/2642059
Note that that answer is for an increasing sequence, and you're asking for a decreasing one.
To accomplish that you will need to reverse the ranking created by mapIndex. So simply reverse temp before creating mapIndex by swapping the partial_sort_copy line with this one:
partial_sort_copy(values.cbegin(), values.cend(), temp.rbegin(), temp.rend());

Insert values from one array to another directly sorted

So what I have to do is insert certain values from one array to another one directly sorted without having to sort them later using BubbleSort or QuickSort or any other method. I can't think of a way to do this... I have to insert them from the biggest value to the smallest one. Here's what I have until now:
void palindroame (int x[100], int y[100]) {
int i=0, j, k=0, aux;
while (x[i]!=0) {
k++; i++;
}
i=0;
for (i=0; i<=k-1; i++) y[i]=0;
for (i=0; i<=k-1; i++) {
if (palindrom(x[i])!=0 && palindrom(x[i+1])!=0)
if (x[i]<x[i+1]) {
aux=x[i+1]; x[i+1]=x[i]; x[i]=aux;
}
} //wrong
for (i=0; i<=k-1; i++) {
if (palindrom(x[i])) y[i]=x[i];
} //wrong
}
Thanks in advance!
The algorithm you need is selection sort; you can use it to sort and copy at the same time.
You can also have a look at priority queues:
http://www.cplusplus.com/reference/queue/priority_queue/
Here's an example of a selection sort I wrote recently (in which a is a vector).
It should give you enough to go on. Hope it helps; ask questions if you like.
for (unsigned int i = 0; i + 1 < a.size(); i++) // i + 1 < size avoids unsigned underflow when a is empty
{
int min = i;
for(unsigned int j = i +1; j < a.size(); j++)
{
// If new minimum is found then stores this as the new minimum
if(a[j] < a[min])
{
min = j;
}
}
// Stores the values in the array in ascending order
if (min != i)
{
int temp = a[i];
a[i] = a[min];
a[min] = temp;
}
}
// Returns the array in ascending order
return a;
Edit: just to clarify, this works on a vector that already has values in it, in case that wasn't clear. I think the example with its code comments should be enough to help you.
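For the priority queue route mentioned above, a minimal sketch could look like this. It assumes the goal is to copy only the palindromes from x into y, largest first, as in the question. The helper is_palindrome is a hypothetical stand-in for the question's palindrom() function, which is not shown, and std::vector is used instead of the fixed-size, zero-terminated arrays.

```cpp
#include <queue>
#include <vector>

// Hypothetical stand-in for the question's palindrom(): true if the decimal
// digits of v read the same in both directions.
bool is_palindrome(int v) {
    if (v < 0) return false;
    long long reversed = 0;
    for (int t = v; t > 0; t /= 10)
        reversed = reversed * 10 + t % 10;
    return reversed == v;
}

// Returns the palindromes of x from the biggest to the smallest, without a
// separate sorting step afterwards.
std::vector<int> palindromes_descending(const std::vector<int>& x) {
    std::priority_queue<int> heap;  // max-heap: top() is the largest element
    for (int v : x)
        if (is_palindrome(v)) heap.push(v);
    std::vector<int> y;
    y.reserve(heap.size());
    while (!heap.empty()) {
        y.push_back(heap.top());    // largest remaining value comes out first
        heap.pop();
    }
    return y;
}
```

With the input {12, 121, 7, 44, 10} this yields {121, 44, 7}. The heap does the ordering work as elements are inserted and removed, so no BubbleSort or QuickSort call is needed afterwards.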