I have a question about merge sort algorithm - c++

I've looked at the merge sort example code, but there's something I don't understand.
void mergesort(int left, int right)
{
    if (left < right)
    {
        int sorted[LEN];
        int mid, p1, p2, idx;
        mid = (left + right) / 2;
        mergesort(left, mid);
        mergesort(mid + 1, right);
        p1 = left;
        p2 = mid + 1;
        idx = left;
        while (p1 <= mid && p2 <= right)
        {
            if (arr[p1] < arr[p2])
                sorted[idx++] = arr[p1++];
            else
                sorted[idx++] = arr[p2++];
        }
        while (p1 <= mid)
            sorted[idx++] = arr[p1++];
        while (p2 <= right)
            sorted[idx++] = arr[p2++];
        for (int i = left; i <= right; i++)
            arr[i] = sorted[i];
    }
}
In this code, I don't understand the third while loop.
In detail, this code inserts the elements at p1 and p2, in order, into the 'sorted' array.
I want to know how these while loops create an ascending array.
I would appreciate it if you could write your answer in detail so that I can understand it.

Why the array is sorted in ascending order
Merge sort divides an array of n elements into n runs of 1 element each. Each of those single-element runs can be considered sorted since it contains only a single element. Pairs of single-element runs are merged to create sorted runs of 2 elements each. Pairs of 2-element runs are merged to create sorted runs of 4 elements each. The process continues until a sorted run equal to the size of the original array is created.
The example in the question is a top down merge sort, which recursively splits the array in half until a base case of a single-element run is reached. After this, merging follows the call chain, depth first, left first. Most libraries use some variation of bottom up merge sort (along with insertion sort used to detect or create small sorted runs). With a bottom up merge sort, there's no recursive splitting; an array of n elements is treated as n runs of 1 element each, and even and odd runs are merged, left to right, in each merge pass. After ceiling(log2(n)) passes, the array is sorted.
The example code has an issue: it allocates an entire array on the stack at each level of recursion, which will result in a stack overflow for large arrays. The Wiki examples are better, although the bottom up example should swap references rather than copy the array.
https://en.wikipedia.org/wiki/Merge_sort
For the question's code, sorted might as well be a global array, or at least be declared static (a single instance):
static int arr[LEN];
static int sorted[LEN];
void mergesort(int left, int right)
/* ... */
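To make the role of those last two while loops concrete, here is a small self-contained sketch of the merge step (a std::vector version using the question's variable names; mergeRuns is an illustrative name, not from the question):

#include <vector>

// Merges the sorted runs arr[left..mid] and arr[mid+1..right].
void mergeRuns(std::vector<int>& arr, std::vector<int>& sorted,
               int left, int mid, int right)
{
    int p1 = left, p2 = mid + 1, idx = left;

    // While both runs still have elements, take the smaller front element.
    // sorted[left..idx-1] stays ascending, and everything already placed is
    // <= both arr[p1] and arr[p2].
    while (p1 <= mid && p2 <= right)
        sorted[idx++] = (arr[p1] < arr[p2]) ? arr[p1++] : arr[p2++];

    // Exactly one of the runs still has elements here. That leftover run is
    // already sorted, and each of its elements is >= everything placed so far
    // (otherwise it would have been picked in the loop above), so plain
    // copying keeps the whole range ascending.
    while (p1 <= mid)   sorted[idx++] = arr[p1++];
    while (p2 <= right) sorted[idx++] = arr[p2++];

    // Copy the merged run back into arr[left..right].
    for (int i = left; i <= right; i++)
        arr[i] = sorted[i];
}

The key invariant: when the main loop ends, every remaining element of the unfinished run is at least as large as everything already written to sorted, so copying it over in its existing order keeps the result ascending.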

I'm a developer working in the field.
I was surprised to see you implementing merge sort.
Before we start: the time complexity of merge sort is O(n log n).
The reason can be found in the merge sort process!
First, let's assume that there is an unsorted array.
Merge sort process:
Divide it into single-element arrays, one for each element of the original array.
Create an array that is twice the size of the divided arrays.
Compare the elements of the two divided arrays and put the smaller element, in order, into the created array.
Repeat this process until the merged array reaches the size of the original array.
There is a reason why the time complexity of merge sort is O(n log n).
The log n factor comes from repeatedly dividing the array in half (about log2(n) levels), and at each level the merging work touches all n elements, so the total is O(n log n).
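A minimal sketch of the standard recurrence argument (assuming the array always splits exactly in half, with c the per-element merge cost):
T(n) = 2·T(n/2) + c·n = 4·T(n/4) + 2·c·n = ... = 2^k·T(n/2^k) + k·c·n
Setting 2^k = n (i.e. k = log2(n)) gives T(n) = n·T(1) + c·n·log2(n) = O(n log n).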

Related

How can I merge, split and query k-th of sorted lists?

Initially I have n elements, and they are in n tiles.
I need to support 3 kinds of queries:
merge two tiles into one tile.
split one tile into two tiles. (Formally for a tile of size k, split it into two tiles of size k1 and k2, k=k1+k2, the first tile contains the smallest k1 elements and the second tile contains the rest)
find the k-th smallest element in one tile.
Assume there are n queries in total. What worst-case time complexity can I achieve?
This will not be a complete answer, but here are some thoughts on what can be done.
My idea is based on skip lists.
Let every tile be an indexable sorted skip list.
Splitting is then rather simple: find the k1-th element and break every link between an element at position i > k1 and an element at position j <= k1 (there are at most O(log n) such links).
Merging is trickier.
First, assume that we can concatenate two skip lists in O(log n).
Let's say we are merging two tiles T1 and T2.
Compare the first elements, t1 from T1 and t2 from T2. Let's say t1 < t2.
Then, find the last element t1' in T1 that is still less than t2.
We must insert t2 right after t1'. But first, look at the element t1* right after t1' in T1.
Now search for the last element t2' in T2 that is still less than t1*.
The entire sequence of elements from T2, starting at t2 and ending at t2', must be inserted between t1' and t1*.
So, we split at t1' and t2', obtaining the new lists T1a, T1b, T2a, T2b.
We concatenate T1a, T2a and T1b, obtaining the new list T1*.
We repeat the entire process for T1* and T2b.
In some pseudo-python-code:
#skiplist interface:
# split(list, k) - splits the list after the k-th element, returns two lists
# concat(list1, list2) - concatenates two lists, returns the new one
# index(list, k) - returns the k-th element of the list
# upper_bound(list, val) - returns the index of the last element less than val
# empty(list) - checks whether the list is empty

def Query(tile, k):
    return index(tile, k)

def Split(tile, k):
    return split(tile, k)

def Merge(tile1, tile2):
    if empty(tile1):
        return tile2
    if empty(tile2):
        return tile1
    t1 = index(tile1, 0)
    t2 = index(tile2, 0)
    if t1 < t2:
        #(1)
        i1 = upper_bound(tile1, t2)
        t1s = index(tile1, i1 + 1)
        i2 = upper_bound(tile2, t1s)
        t1_head, t1_tail = split(tile1, i1)
        t2_head, t2_tail = split(tile2, i2)
        head = concat(t1_head, t2_head)
        tail = Merge(t1_tail, t2_tail)
        return concat(head, tail)
    else:
        #swap tile1, tile2, do (1)
There are at most O(p) such iterations, where p is the number of interleaved runs in T1 and T2. Every iteration takes O(log n) operations to complete.
As noted by #newbie, there is an example where the sum of the p values equals n log n.
This python script generates such an example for k = log_2 n (the plus sign in the output stands for merge):
def f(l):
    if len(l) == 2:
        return "%s+%s" % (l[0], l[1])
    if len(l) == 1:
        return str(l[0])
    l1 = [l[i] for i in range(0, len(l), 2)]
    l2 = [l[i + 1] for i in range(0, len(l), 2)]
    l_str = f(l1)
    r_str = f(l2)
    return "(%s)+(%s)" % (l_str, r_str)

def example(k):
    print(f(list(range(0, 2 ** k))))
For n = 16:
example(4)
Gives us the following queries:
(
(
(0+8)+(4+12)
)
+
(
(2+10)+(6+14)
)
)
+
(
(
(1+9)+(5+13)
)
+
(
(3+11)+(7+15)
)
)
This is a binary tree where, at height j, we merge 2^(k-j) tiles of size 2^j each. The tiles are constructed in such a way that their elements are always interleaved, so for tiles of size q we are doing O(q) splits and concatenations.
However, this still doesn't worsen the overall complexity of O(n log n) for this specific case, as (speaking very informally) each split-concatenation of the 'small' lists costs less than O(log n), and there are many more 'small' lists than 'big' ones.
I'm not sure whether there are worse counterexamples, but for now I think the overall worst-case complexity for n queries is somewhere between n log^2 n and n log n.
Look for:
1. std::merge or std::set_union
2. std::partition
3. std::find (or std::find_if)
Linear complexity for 1 and 2.
Depends on your container for 3; linear at worst.
But it's not clear what you're asking exactly. Do you have some code we can look at?
When I asked this question I didn't know how to solve it. Since it seems that it's okay to answer my own question, I'm going to answer it myself :/
First, let's suppose the values in the sorted lists are integers between 1 and n. If not, you may just sort the values and map them to 1..n.
Let's build a segment tree for every sorted list; the segment trees are built over the values (1 to n). Every node of a segment tree stores how many numbers fall in its range; let's call this the value of the node.
It seems that this requires O(n log n) space to store every segment tree, but we can simply drop the nodes whose value is 0, and actually allocate a node only when its value becomes > 0.
So for a sorted list with only one element, we simply build a chain of nodes down to that value, so only O(log n) memory is needed.
int s[SZ],     /* value of a node */
    ch[SZ][2]; /* a node's two children */

// build a segment tree containing only the value p, returned in the first argument
// call with something like build(root, 1, n, value);
void build(int& x, int l, int r, int p)
{
    x = /*a new node*/; s[x] = 1;
    if (l == r) return;
    int m = (l + r) >> 1;
    if (p <= m) build(ch[x][0], l, m, p);
    else build(ch[x][1], m + 1, r, p);
}
When we split a segment tree (sorted list), we simply split the two children recursively:
// make a new node t2, split t1 into t1 and t2 so that s[t1] = k
void split(int t1, int& t2, int k)
{
    t2 = /*a new node*/;
    int ls = s[ch[t1][0]];                             // size of t1's left child
    if (k > ls) split(ch[t1][1], ch[t2][1], k - ls);   // split the right child of t1
    else swap(ch[t1][1], ch[t2][1]);                   // the whole right child belongs to t2
    if (k < ls) split(ch[t1][0], ch[t2][0], k);        // split the left child of t1
    s[t2] = s[t1] - k; s[t1] = k;
}
When we merge two sorted lists, we merge their trees node by node:
// merge trees t1 & t2, return the merged segment tree
int merge(int t1, int t2)
{
    if (!t1 || !t2) return t1 ^ t2;   // nothing to merge
    ch[t1][0] = merge(ch[t1][0], ch[t2][0]);
    ch[t1][1] = merge(ch[t1][1], ch[t2][1]);
    s[t1] += s[t2];
    /* erase t2, it's useless now */
    return t1;
}
It looks very simple, doesn't it? But its total complexity is in fact O(n log n).
Proof:
Let's count the total number of allocated segment tree nodes.
Initially we allocate O(n log n) such nodes (O(log n) for each list).
Each split allocates at most O(log n) more nodes, so splitting also contributes O(n log n) in total. The reason is that at each level we recurse into only one of a node's two children.
So the total number of allocated segment tree nodes is at most O(n log n).
Now consider merging. Apart from the 'nothing to merge' base case, every call to merge decreases the total number of allocated nodes by 1 (t2 is no longer useful). A 'nothing to merge' call only happens when its parent performs a real merge, so those calls don't affect the complexity.
Since the total number of allocated nodes is O(n log n) and every useful merge decreases it by 1, the total complexity of all merges is O(n log n).
Summing up, we get the result.
Querying the k-th element is also very simple, and then we're done :)
// query the k-th smallest value in segment tree x over [l, r]
int ask(int x, int l, int r, int k)
{
    if (l == r) return l;
    int ls = s[ch[x][0]];   // how many numbers are in the left child
    int m = (l + r) >> 1;
    if (k > ls) return ask(ch[x][1], m + 1, r, k - ls);
    return ask(ch[x][0], l, m, k);
}
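For completeness, a rough usage sketch of how these pieces might be wired together; the newNode() allocator, the array sizes and the root[]/val[] arrays are my assumptions, not part of the original answer (which leaves node allocation as /*a new node*/):

// Hypothetical glue code for the snippets above. SZ must be large enough for
// the O(n log n) nodes argued above; node 0 acts as the null node (s[0] == 0).
int nodeCount = 0;
int newNode() { return ++nodeCount; }   // what /*a new node*/ could expand to

int n;              // values are assumed to lie in 1..n
int root[100005];   // root[i] = root node of the segment tree for tile i
int val[100005];    // val[i]  = the single value initially placed in tile i

void initTiles()
{
    for (int i = 1; i <= n; i++)
        build(root[i], 1, n, val[i]);   // one O(log n) chain per tile
}

// Typical operations:
//   root[a] = merge(root[a], root[b]);   // merge tile b into tile a
//   split(root[a], root[c], k);          // tile a keeps its k smallest values,
//                                        // the rest move to the new tile c
//   int kth = ask(root[a], 1, n, k);     // k-th smallest value in tile a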

Iterative Merge Sort, works same speed as Bubblesort

I have tried to implement an iterative merge sort using nested loops. Although this algorithm does sort correctly (as in, after sorting, things are in the correct order), I know there is something wrong with this implementation: when I sort larger collections with it and compare timings against slower sorts, I end up getting slow times for this iterative implementation. For example, sorting 500 items takes 31 milliseconds with this implementation, just like bubble sort does.
#include <iostream>
#include <vector>
using namespace std;

void IterativeMergeSort(vector<int> &items, int start, int end);
void ItMerge(vector<int> &items, int start, int mid, int end, vector<int> &temp);

int main()
{
    int size;
    cin >> size;
    //assume vector is already initialized with values & size
    vector<int> items(size);
    IterativeMergeSort(items, 0, size - 1);
}

void IterativeMergeSort(vector<int> &items, int start, int end)
{
    vector<int> temp(items.size());
    int left, middle, right;
    for(int outer = 1; outer < 2; outer *= 2)
    {
        for(int inner = start; inner < end; inner = inner * outer + 1)
        {
            left = outer - 1;
            middle = inner;
            right = inner + 1;
            ItMerge(items, left, middle, right, temp);
        }
    }
}

void ItMerge(vector<int> &items, int start, int mid, int end, vector<int> &temp)
{
    int first1 = start;
    int last1 = mid;
    int first2 = mid + 1;
    int last2 = end;
    int index = first1;
    while(first1 <= last1 && first2 <= last2)
    {
        if(items[first1] <= items[first2])
        {
            temp[index] = items[first1];
            first1++;
        }
        else
        {
            temp[index] = items[first2];
            first2++;
        }
        index++;
    }
    while(first1 <= last1)
    {
        temp[index] = items[first1];
        first1++;
        index++;
    }
    while(first2 <= last2)
    {
        temp[index] = items[first2];
        first2++;
        index++;
    }
    for(index = start; index <= end; index++)
    {
        items[index] = temp[index];
    }
}
Your algorithm isn't merge sort. It tries to be, but it isn't.
As I understand it, what is supposed to happen is that the inner loop steps over subsequences and merges them, while the outer loop controls the inner loop's sequence length, starting with 1 and doubling on every iteration until there are just two subsequences and they get merged.
But that's not what your algorithm is doing. The outer loop's condition is broken, so the outer loop will run exactly once. And the inner loop doesn't take roughly-equal subsequences in pairs. Instead, the right subsequence is exactly one element (mid is inner, right is inner+1) and the left subsequence is always everything used so far (left is outer-1, and outer is constant 1). So the algorithm will repeatedly merge the already-sorted left subsequence with a single-element right subsequence.
This means that in effect, your algorithm is insertion sort, except that you don't insert in place, but instead copy the sorted sequence to a buffer, inserting the new element at the right moment, then copy the result back. So it's a very inefficient insertion sort.
Below is a link to somewhat optimized examples of top down and bottom up merge sort. The bottom up merge sort is a bit faster because it skips the recursion used to repeatedly generate sub-pairs of indexes until a sub-pair represents a run of size 1. Most of the time is spent merging, so bottom up isn't that much faster. The first pass of the bottom up merge sort could be optimized by swapping pairs in place rather than copying them. The bottom up merge sort ends up with the sorted data in either the temp or the original array. If the original array is wanted, then a pass count can be calculated, and if the count is odd, the first pass swaps in place.
Both versions can sort 4 million 64-bit unsigned integers in less than a second on my system (Intel Core i7 2600K, 3.4 GHz).
merge_sort using vectors works well with less than 9 inputs
For a vector or array of integers, a counting / radix sort would be faster still.
I've finally figured it out.
In pseudocode:
for(outer = 1; outer < length; outer *= 2)
    for(inner = 0; inner < length; inner = inner + (outer * 2))
        left = inner
        middle = (inner + outer) - 1
        right = (inner + (outer * 2)) - 1
        merge(items, left, middle, right, temp)
After rethinking how the iterative merge sort is supposed to work and looking at a couple of implementations, all I needed in the merge method was to check whether the middle and right indexes passed in were greater than or equal to the vector size (that way we handle any values that could go out of bounds), and then merge as usual. Also, looking at this helped greatly in understanding it; also this. Just to be sure that it works as well as a recursive merge sort, I did timings on both, and both (recursive and iterative) implementations produced identical times for 500, 1000, 5000, and 10K values to sort (in some cases the iterative solution produced a faster time).
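For reference, here is one plausible C++ translation of that pseudocode (a sketch only; it reuses the question's ItMerge, clamps middle and right as described above, and the function name is mine):

#include <algorithm>   // std::min
#include <vector>

void IterativeMergeSortFixed(std::vector<int>& items)
{
    int length = static_cast<int>(items.size());
    std::vector<int> temp(items.size());
    for (int outer = 1; outer < length; outer *= 2)          // run width: 1, 2, 4, ...
    {
        for (int inner = 0; inner < length; inner += outer * 2)
        {
            int left   = inner;
            int middle = std::min(inner + outer - 1, length - 1);
            int right  = std::min(inner + outer * 2 - 1, length - 1);
            if (middle < right)                              // a lone tail run needs no merge
                ItMerge(items, left, middle, right, temp);
        }
    }
}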

Improvement of quick sort

The quicksort algorithm has bad behavior when there are many copies of the same item (I mean repetitive data). How could it be improved so that this issue is resolved?
int partition(int low, int high)
{
    int j = low, i = low + 1;
    int PivotItem = arr[low];
    for(int i = 0, j = 0; i < n; i++)
    {
        if(arr[i] == PivotItem)
            subarray[j] = arr[i];
    }
    for(i = low + 1; i <= high; i++)
    {
        if(arr[i] < PivotItem)
        {
            j++;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[low], arr[j]);
    int PivotPoint = j;
    return PivotPoint;
}

void quick_sort(int low, int high)
{
    if(low == high)
        return;
    int PivotPoint = partition(low, high);
    quick_sort(low, PivotPoint - 1);
    quick_sort(PivotPoint + 1, high);
}
There is a special modification of quicksort known as the Dutch national flag algorithm. It uses a three-way partition into items smaller than, equal to, and bigger than the pivot value.
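A hedged sketch of that three-way partition (Dijkstra's Dutch national flag scheme), written against the question's global arr[]; the function names and the lt/gt output parameters are mine:

#include <algorithm>   // std::swap

// After partition3 returns: arr[low..lt-1] < pivot, arr[lt..gt] == pivot,
// arr[gt+1..high] > pivot. Only the two outer regions need further sorting.
void partition3(int low, int high, int& lt, int& gt)
{
    int pivot = arr[low];
    lt = low; gt = high;
    int i = low;
    while (i <= gt)
    {
        if (arr[i] < pivot)      std::swap(arr[lt++], arr[i++]);
        else if (arr[i] > pivot) std::swap(arr[i], arr[gt--]);
        else                     i++;
    }
}

void quick_sort3(int low, int high)
{
    if (low >= high) return;
    int lt, gt;
    partition3(low, high, lt, gt);
    quick_sort3(low, lt - 1);    // strictly smaller elements
    quick_sort3(gt + 1, high);   // strictly larger elements; the equal block is already in place
}

With this scheme an array consisting of a single repeated value is handled in one linear pass instead of triggering the quadratic worst case.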
I assume you mean the fact that quicksort compares elements with a <= comparator (or <, in which case the result is symmetric to the following explanation). If we look at the case where all elements are equal to the pivot x, we get quicksort's worst-case complexity, since we split the array into two very uneven parts: one of size n-1, and the other empty.
A quick fix to address this issue is to use quicksort only with < and > to split the data into the two subarrays, and, instead of a single pivot, hold an array containing all the elements equal to the pivot. Then recurse on the elements that are strictly larger than the pivot and on the elements that are strictly smaller than the pivot, and combine the three arrays.
Illustration:
legend: X=pivot, S = smaller than pivot, L = larger than pivot
array = |SLLLSLXLSLXSSLLXLLLSSXSSLLLSSXSSLLLXSSL|
Choose pivot - X
Create a left array of only the strictly smaller elements: |SSSSSSSSSSSSSSS|
Create a right array of only the strictly larger elements: |LLLLLLLLLLLLLLLLLL|
Create a "pivot array": |XXXXXX|
Now recurse on the left array, recurse on the right array, and combine:
|SSSSSSSSSSSSSSS XXXXXX LLLLLLLLLLLLLLLLLL|

choose n largest elements in two vector

I have two vectors, each containing n unsorted elements. How can I get the n largest elements across these two vectors?
My solution is to merge the two vectors into one with 2n elements and then use the std::nth_element algorithm, but I found that's not very efficient. Does anyone have a more efficient solution? Really appreciated.
You may push the elements into a priority_queue and then pop n elements out.
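A minimal sketch of that idea (the function and parameter names are illustrative): push everything from both vectors into a max-heap priority_queue and pop n times.

#include <queue>
#include <vector>

std::vector<int> nLargestByHeap(const std::vector<int>& a, const std::vector<int>& b, int n)
{
    std::priority_queue<int> pq;          // max-heap by default
    for (int x : a) pq.push(x);
    for (int x : b) pq.push(x);
    std::vector<int> result;
    for (int i = 0; i < n && !pq.empty(); i++)
    {
        result.push_back(pq.top());       // current largest remaining element
        pq.pop();
    }
    return result;                        // the n largest, in descending order
}

This is O(N log N) for N total elements; the approaches below improve on that when n is much smaller than N.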
Assuming that n is far smaller than N, this is quite efficient. Getting minElem is cheap, and sorted insertion into L is cheaper than sorting the two vectors if n << N.
L := SortedList()
For each element in either of the vectors do
{
    minElem := smallest element in L
    if( element >= minElem or size of L < n )
    {
        add element to L
        if( size of L > n )
        {
            remove smallest element from L
        }
    }
}
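A rough C++ translation of this pseudocode, using std::multiset as the sorted list L (names are illustrative, not from the original answer):

#include <set>
#include <vector>

std::vector<int> nLargestBySortedList(const std::vector<int>& a, const std::vector<int>& b, std::size_t n)
{
    std::multiset<int> L;                                // at most n candidates, kept sorted; assumes n >= 1
    auto consider = [&](int element) {
        if (L.size() < n || element >= *L.begin())       // *L.begin() is the current minimum
        {
            L.insert(element);
            if (L.size() > n) L.erase(L.begin());        // drop the smallest again
        }
    };
    for (int x : a) consider(x);
    for (int x : b) consider(x);
    return std::vector<int>(L.begin(), L.end());         // the n largest, ascending
}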
vector<T> heap;
heap.reserve(n + 1);
vector<T>::iterator left = leftVec.begin(), right = rightVec.begin();
for (int i = 0; i < n; i++) {
    if (left != leftVec.end()) heap.push_back(*left++);
    else if (right != rightVec.end()) heap.push_back(*right++);
}
if (left == leftVec.end() && right == rightVec.end()) return heap;
make_heap(heap.begin(), heap.end(), greater<T>());
while (left != leftVec.end()) {
    heap.push_back(*left++);
    push_heap(heap.begin(), heap.end(), greater<T>());
    pop_heap(heap.begin(), heap.end(), greater<T>());
    heap.pop_back();
}
/* ... repeat for right ... */
return heap;
Note I use *_heap directly rather than priority_queue because priority_queue does not provide access to its underlying data structure. This is O(N log n), slightly better than the naive O(N log N) method if n << N.
You can do the "n-th element" algorithm conceptually in parallel on the two vectors quite easily (at least the simple variant that is only linear in the average case).
Pick a pivot.
Partition (std::partition) both vectors by that pivot. You'll have the first vector partitioned by some element with rank i and the second by some element with rank j. I'm assuming descending order here.
If i+j < n, recurse on the right side for the n-i-j greatest elements. If i+j > n, recurse on the left side for the n greatest elements. If you hit i+j==n, stop the recursion.
You basically just need to make sure to partition both vectors by the same pivot in every step. Given decent pivot selection, this algorithm is linear in the average case (and works in place); a rough sketch follows below.
See also: http://en.wikipedia.org/wiki/Selection_algorithm#Partition-based_general_selection_algorithm
Edit: (hopefully) clarified the algorithm a bit.
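A rough sketch of the idea above (my own translation of the steps, using a three-way partition so runs of equal elements can't cause endless recursion); dualSelect and its contract are assumptions, not code from the answer:

#include <algorithm>
#include <vector>

// After the call, the k largest elements across a[aLo,aHi) and b[bLo,bHi)
// occupy the fronts of those two ranges (possibly alongside tied pivot copies).
void dualSelect(std::vector<int>& a, int aLo, int aHi,
                std::vector<int>& b, int bLo, int bHi, int k)
{
    if (k <= 0 || (aHi - aLo) + (bHi - bLo) <= k) return;   // nothing left to narrow down
    int pivot = (aHi > aLo) ? a[aLo + (aHi - aLo) / 2]
                            : b[bLo + (bHi - bLo) / 2];
    auto gt = [pivot](int x) { return x > pivot; };
    auto eq = [pivot](int x) { return x == pivot; };
    // Rearrange both ranges as: strictly greater | equal to pivot | smaller.
    int ag = int(std::partition(a.begin() + aLo, a.begin() + aHi, gt) - a.begin());
    int ae = int(std::partition(a.begin() + ag,  a.begin() + aHi, eq) - a.begin());
    int bg = int(std::partition(b.begin() + bLo, b.begin() + bHi, gt) - b.begin());
    int be = int(std::partition(b.begin() + bg,  b.begin() + bHi, eq) - b.begin());
    int greater = (ag - aLo) + (bg - bLo);
    int greaterOrEqual = (ae - aLo) + (be - bLo);
    if (k <= greater)
        dualSelect(a, aLo, ag, b, bLo, bg, k);                   // all winners are > pivot
    else if (k > greaterOrEqual)
        dualSelect(a, ae, aHi, b, be, bHi, k - greaterOrEqual);  // keep everything >= pivot, find the rest
    // else: the > pivot elements plus enough pivot copies already form the k largest
}

Average-case linear behaviour depends on decent pivot selection, as the answer notes.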

Top 10 Frequencies in a Hash Table with Linked Lists

The code below will print the highest frequency it can find in my hash table (which is a bunch of linked lists) 10 times. I need my code to print the top 10 frequencies in my hash table. I do not know how to do this (code examples would be great; plain-English logic/pseudocode is just as good).
I create a temporary pointer called 'tmp' which points into my hash table 'hashtable'.
A while loop then goes through the list and looks for the highest frequency, which is the int 'tmp->freq'.
The loop continues this process, recording the highest frequency it finds so far in the variable 'topfreq', until it reaches the end of the linked lists in the hash table.
My 'node' is a struct comprising the variables 'freq' (an int) and 'word' (a 128-char array). When the loop has nothing else to search, it prints these two values on screen.
The problem is, I can't wrap my head around how to find the next lower number after the one I've just found (and this can include another node with the same freq value, so I have to check that the word is not the same too).
void toptenwords()
{
    int topfreq = 0;
    int minfreq = 0;
    char topword[SIZEOFWORD];
    for(int p = 0; p < 10; p++) // We need the top 10 frequencies... so we do this 10 times
    {
        for(int m = 0; m < HASHTABLESIZE; m++) // Go through the entire hash table
        {
            node* tmp;
            tmp = hashtable[m];
            while(tmp != NULL) // Walk through the entire linked list
            {
                if(tmp->freq > topfreq) // If the frequency at hand is larger than the one found, store it...
                {
                    topfreq = tmp->freq;
                    strcpy(topword, tmp->word);
                }
                tmp = tmp->next;
            }
        }
        cout << topfreq << "\t" << topword << endl;
    }
}
Any and all help would be GREATLY appreciated :)
Keep an array of 10 node pointers, and insert each node into the array, maintaining the array in sorted order. The eleventh node in the array is overwritten on each iteration and contains junk.
void toptenwords()
{
    int topfreq = 0;
    int minfreq = 0;
    node *topwords[11];
    int current_topwords = 0;
    for(int m = 0; m < HASHTABLESIZE; m++) // Go through the entire hash table
    {
        node* tmp;
        tmp = hashtable[m];
        while(tmp != NULL) // Walk through the entire linked list
        {
            topwords[current_topwords] = tmp;
            current_topwords++;
            for(int i = current_topwords - 1; i > 0; i--)
            {
                if(topwords[i]->freq > topwords[i - 1]->freq)
                {
                    node *temp = topwords[i - 1];
                    topwords[i - 1] = topwords[i];
                    topwords[i] = temp;
                }
                else break;
            }
            if(current_topwords > 10) current_topwords = 10;
            tmp = tmp->next;
        }
    }
}
I would maintain a set of words already used and change the inner-most if condition to test for frequency greater than previous top frequency AND tmp->word not in list of words already used.
When iterating over the hash table (and then over each linked list contained therein), keep a self-balancing binary tree (std::set) as a "result" list. As you come across each frequency, insert it into the list, then truncate the list if it has more than 10 entries. When you finish, you'll have a set (a sorted list) of the top ten frequencies, which you can manipulate as you desire.
There may be performance gains to be had by using sets instead of linked lists in the hash table itself, but you can work that out for yourself.
Step 1 (Inefficient):
Move the data into a sorted container via insertion sort, but insert into a container (e.g. a linked list or vector) of size 10, and drop any elements that fall off the bottom of the list.
Step 2 (Efficient):
Same as step 1, but keep track of the size of the item at the bottom of the list, and skip the insertion step entirely if the current item is too small.
Suppose there are n words in total, and we need the most-frequent k words (here, k = 10).
If n is much larger than k, the most efficient way I know of is to maintain a min-heap (i.e. the top element has the minimum frequency of all elements in the heap). On each iteration, you insert the next frequency into the heap, and if the heap now contains k+1 elements, you remove the smallest. This way, the heap is maintained at a size of k elements throughout, containing at any time the k highest-frequency elements seen so far. At the end of processing, read out the k highest-frequency elements in increasing order.
Time complexity: For each of n words, we do two things: insert into a heap of size at most k, and remove the minimum element. Each operation costs O(log k) time, so the entire loop takes O(n log k) time. Finally, we read out the k elements from a heap of size at most k, taking O(k log k) time, for a total time of O((n+k) log k). Since we know that k < n, O(k log k) is at worst O(n log k), so this can be simplified to just O(n log k).
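A hedged sketch of that min-heap approach against the question's structures (node, hashtable and HASHTABLESIZE come from the question; the function name and k are mine):

#include <iostream>
#include <queue>
#include <vector>

void topKWordsByHeap(int k = 10)
{
    // Min-heap ordered by frequency: the root is always the smallest of the
    // k best candidates seen so far.
    auto byFreq = [](const node* a, const node* b) { return a->freq > b->freq; };
    std::priority_queue<node*, std::vector<node*>, decltype(byFreq)> heap(byFreq);
    for (int m = 0; m < HASHTABLESIZE; m++)
        for (node* tmp = hashtable[m]; tmp != NULL; tmp = tmp->next)
        {
            heap.push(tmp);
            if ((int)heap.size() > k) heap.pop();   // evict the current minimum
        }
    // Prints the top k in increasing order of frequency.
    while (!heap.empty())
    {
        std::cout << heap.top()->freq << "\t" << heap.top()->word << std::endl;
        heap.pop();
    }
}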
A hash table containing linked lists of words seems like a peculiar data structure to use if the goal is to accumulate word frequencies.
Nonetheless, an efficient way to get the ten highest-frequency nodes is to insert each into a priority queue/heap, such as a Fibonacci heap, which has O(1) insertion time and O(n) worst-case deletion time. Assuming that iteration over the hash table is fast, this method has a runtime of O(n×O(1) + 10×O(n)) ≡ O(n).
The absolute fastest way to do this would be to use a SoftHeap. Using a SoftHeap, you can find the top 10 items in O(n) time whereas every other solution posted here would take O(n lg n) time.
http://en.wikipedia.org/wiki/Soft_heap
This wikipedia article shows how to find the median in O(n) time using a softheap, and the top 10 is simply a subset of the median problem. You could then sort the items that were in the top 10 if you needed them in order, and since you're always at most sorting 10 items, it's still O(n) time.