How to implement a minimum heap sort to find the kth smallest element?

How to implement a minimum heap sort to find the kth smallest element? - c++

I've been implementing selection sort problems for class and one of the assignments is to find the kth smallest element in the array using a minimum heap. I know the procedure is:
heapify the array
delete the minimum (root) k times
return kth smallest element in the group
I don't have any problems creating a minimum heap. I'm just not sure how to go about properly deleting the minimum k times and successfully return the kth smallest element in the group. Here's what I have so far:
bool Example::min_heap_select(long k, long & kth_smallest) const {
//duplicate test group (thanks, const!)
Example test = Example(*this);
//variable delcaration and initlization
int n = test._total ;
int i;
//Heapifying stage (THIS WORKS CORRECTLY)
for (i = n/2; i >= 0; i--) {
//allows for heap construction
test.percolate_down_protected(i, n);
}//for
//Delete min phase (THIS DOESN'T WORK)
for(i = n-1; i >= (n-k+1); i--) {
//deletes the min by swapping elements
int tmp = test._group[0];
test._group[0] = test._group[i];
test._group[i] = tmp;
//resumes perc down
test.percolate_down_protected(0, i);
}//for
//IDK WHAT TO RETURN
kth_smallest = test._group[0];
void Example::percolate_down_protected(long i, long n) {
//variable declaration and initlization:
int currPos, child, r_child, tmp;
currPos = i;
tmp = _group[i];
child = left_child(i);
//set a sentinel and begin loop (no recursion allowed)
while (child < n) {
//calculates the right child's position
r_child = child + 1;
//we'll set the child to index of greater than right and left children
if ((r_child > n ) && (_group[r_child] >= _group[child])) {
child = r_child;
}
//find the correct spot
if (tmp <= _group [child]) {
break;
}
//make sure the smaller child is beneath the parent
_group[currPos] = _group[child];
//shift the tree down
currPos = child;
child = left_child(currPos);
}
//put tmp where it belongs
_group[currPos] = tmp;
}
As I stated before, the minimum heap part works correctly. I understand what I what to do- it seems easy to delete the root k times but then after that what index in the array do I return... 0? This almost works- it doesn't worth with k = n or k = 1.Would the kth smallest element be in the Any help would be much appreciated!

The only array index which is meaningful to the user is zero, which is the minimum element. So, after removing k elements, the k'th smallest element will be at zero.
Probably you should destroy the heap and return the value rather than asking the user to concern themself with the heap itself… but I don't know the details of the assignment.
Note that the C++ Standard Library has algorithms to help with this: make_heap, pop_heap, and nth_element.

I am not providing a detailed answer, just explaining the key points in getting k smallest elements in a min-heap ordered tree. The approach uses skip lists.
First form a skip list of nodes of the tree with just one element the node corresponding to the root of the heap. the 1st minimum element is just the value stored at this node.
Now delete this node and insert its child nodes in the right position such that to maintain the order of values. This steps takes O(logk) time.
The second minimum value is just then the value at first node in this skip list.
Repeat the above steps until you get all the k minimum elements. The overall time complexity will be log(2)+log(3)+log(4)+... log(k) = O(k.logk). Forming a heap takes time n, so overall time complexity is O(n+klogk).
There is one more approach without making a heap that is Quickselect, which has an average time complexity of O(n) but worst case as O(n^2).
The striking difference between the two approaches is that the first approach gives all the k elements the minimum upto the kth minimum, while quickSelect gives only the kth minimum element.
Memory wise the former approach uses O(n) extra space which quickSelect uses O(1).

Related

Order Notation of pop-max in a binary heap

I need to write a pop_max function for a binary heap that removes the max element. The solution given is as below:
void pop_max() {
assert(!m_heap.empty());
int tmp = (size()+1)/2;
for (int i = tmp+1; i < size(); i++) {
if (m_heap[tmp] < m_heap[i])
tmp = i;
}
m_heap[tmp] = m_heap.back();
m_heap.pop_back();
this->percolate_up(tmp);
}
The solution also says the number of "nodes" visited is n+log(n) where n is the total number of nodes in the heap. It then goes on to say the running time is o(n).
This makes zero sense to me though.
Their solution finds the first leaf node int tmp = (size()+1)/2; then goes through the remaining leaf nodes.
Is their solution not n/2 nodes visited and o(n/2) for running time as well? Could someone explain why this might be?
Edit: O(n/2) = O(n). But what about the number of nodes visited? I still don't quite understand how it is o(n+log(n))

O(n/2) is equal to O(n)
Number of nodes visited means n for the leaf nodes and log(n) for percolate up!

Array balancing point

What is the best way to solve this?
A balancing point of an N-element array A is an index i such that all elements on lower indexes have values <= A[i] and all elements on higher indexes have values higher or equal A[i].
For example, given:
A[0]=4 A[1]=2 A[2]=7 A[3]=11 A[4]=9
one of the correct solutions is: 2. All elements below A[2] is less than A[2], all elements after A[2] is more than A[2].
One solution that appeared to my mind is O(nsquare) solution. Is there any better solution?

Start by assuming A[0] is a pole. Then start walking the array; comparing each element A[i] in turn against A[0], and also tracking the current maximum.
As soon as you find an i such that A[i] < A[0], you know that A[0] can no longer be a pole, and by extension, neither can any of the elements up to and including A[i]. So now continue walking until you find the next value that's bigger than the current maximum. This then becomes the new proposed pole.
Thus, an O(n) solution!
In code:
int i_pole = 0;
int i_max = 0;
bool have_pole = true;
for (int i = 1; i < N; i++)
{
if (A[i] < A[i_pole])
{
have_pole = false;
}
if (A[i] > A[i_max])
{
i_max = i;
if (!have_pole)
{
i_pole = i;
}
have_pole = true;
}
}

If you want to know where all the poles are, an O(n log n) solution would be to create a sorted copy of the array, and look to see where you get matching values.
EDIT: Sorry, but this doesn't actually work. One counterexample is [2, 5, 3, 1, 4].

Make two auxiliary arrays, each with as many elements as the input array, called MIN and MAX.
Each element M of MAX contains the maximum of all the elements in the input from 0..M. Each element M of MIN contains the minimum of all the elements in the input from M..N-1.
For each element M of the input array, compare its value to the corresponding values in MIN and MAX. If INPUT[M] == MIN[M] and INPUT[M] == MAX[M] then M is a balancing point.
Building MIN takes N steps, and so does MAX. Testing the array then takes N more steps. This solution has O(N) complexity and finds all balancing points. In the case of sorted input every element is a balancing point.

Create a double-linked list such as i-th node of this list contains A[i] and i. Traverse this list while elements grow (counting maximum of these elements). If some A[bad] < maxSoFar it can't be MP. Remove it and go backward removing elements until you find A[good] < A[bad] or reach the head of the list. Continue (starting with maxSoFar as maximum) until you reach end of the list. Every element in result list is MP and every MP is in this list. Complexity is O(n) since is maximum of steps is performed for descending array - n steps forward and n removals.
Update
Oh my, I confused "any" with "every" in problem definition :).

You can combine bmcnett's and Oli's answers to find all the poles as quickly as possible.
std::vector<int> i_poles;
i_poles.push_back(0);
int i_max = 0;
for (int i = 1; i < N; i++)
{
while (!i_poles.empty() && A[i] < A[i_poles.back()])
{
i_poles.pop_back();
}
if (A[i] >= A[i_max])
{
i_poles.push_back(i);
}
}
You could use an array preallocated to size N if you wanted to avoid reallocations.

Randomly permute N first elements of a singly linked list

I have to permute N first elements of a singly linked list of length n, randomly. Each element is defined as:
typedef struct E_s
{
struct E_s *next;
}E_t;
I have a root element and I can traverse the whole linked list of size n. What is the most efficient technique to permute only N first elements (starting from root) randomly?
So, given a->b->c->d->e->f->...x->y->z I need to make smth. like f->a->e->c->b->...x->y->z
My specific case:
n-N is about 20% relative to n
I have limited RAM resources, the best algorithm should make it in place
I have to do it in a loop, in many iterations, so the speed does matter
The ideal randomness (uniform distribution) is not required, it's Ok if it's "almost" random
Before making permutations, I traverse the N elements already (for other needs), so maybe I could use this for permutations as well
UPDATE: I found this paper. It states it presents an algorithm of O(log n) stack space and expected O(n log n) time.

I've not tried it, but you could use a "randomized merge-sort".
To be more precise, you randomize the merge-routine. You do not merge the two sub-lists systematically, but you do it based on a coin toss (i.e. with probability 0.5 you select the first element of the first sublist, with probability 0.5 you select the first element of the right sublist).
This should run in O(n log n) and use O(1) space (if properly implemented).
Below you find a sample implementation in C you might adapt to your needs. Note that this implementation uses randomisation at two places: In splitList and in merge. However, you might choose just one of these two places. I'm not sure if the distribution is random (I'm almost sure it is not), but some test cases yielded decent results.
#include <stdio.h>
#include <stdlib.h>
#define N 40
typedef struct _node{
int value;
struct _node *next;
} node;
void splitList(node *x, node **leftList, node **rightList){
int lr=0; // left-right-list-indicator
*leftList = 0;
*rightList = 0;
while (x){
node *xx = x->next;
lr=rand()%2;
if (lr==0){
x->next = *leftList;
*leftList = x;
}
else {
x->next = *rightList;
*rightList = x;
}
x=xx;
lr=(lr+1)%2;
}
}
void merge(node *left, node *right, node **result){
*result = 0;
while (left || right){
if (!left){
node *xx = right;
while (right->next){
right = right->next;
}
right->next = *result;
*result = xx;
return;
}
if (!right){
node *xx = left;
while (left->next){
left = left->next;
}
left->next = *result;
*result = xx;
return;
}
if (rand()%2==0){
node *xx = right->next;
right->next = *result;
*result = right;
right = xx;
}
else {
node *xx = left->next;
left->next = *result;
*result = left;
left = xx;
}
}
}
void mergeRandomize(node **x){
if ((!*x) || !(*x)->next){
return;
}
node *left;
node *right;
splitList(*x, &left, &right);
mergeRandomize(&left);
mergeRandomize(&right);
merge(left, right, &*x);
}
int main(int argc, char *argv[]) {
srand(time(NULL));
printf("Original Linked List\n");
int i;
node *x = (node*)malloc(sizeof(node));;
node *root=x;
x->value=0;
for(i=1; i<N; ++i){
node *xx;
xx = (node*)malloc(sizeof(node));
xx->value=i;
xx->next=0;
x->next = xx;
x = xx;
}
x=root;
do {
printf ("%d, ", x->value);
x=x->next;
} while (x);
x = root;
node *left, *right;
mergeRandomize(&x);
if (!x){
printf ("Error.\n");
return -1;
}
printf ("\nNow randomized:\n");
do {
printf ("%d, ", x->value);
x=x->next;
} while (x);
printf ("\n");
return 0;
}

Convert to an array, use a Fisher-Yates shuffle, and convert back to a list.

I don't believe there's any efficient way to randomly shuffle singly-linked lists without an intermediate data structure. I'd just read the first N elements into an array, perform a Fisher-Yates shuffle, then reconstruct those first N elements into the singly-linked list.

First, get the length of the list and the last element. You say you already do a traversal before randomization, that would be a good time.
Then, turn it into a circular list by linking the first element to the last element. Get four pointers into the list by dividing the size by four and iterating through it for a second pass. (These pointers could also be obtained from the previous pass by incrementing once, twice, and three times per four iterations in the previous traversal.)
For the randomization pass, traverse again and swap pointers 0 and 2 and pointers 1 and 3 with 50% probability. (Do either both swap operations or neither; just one swap will split the list in two.)
Here is some example code. It looks like it could be a little more random, but I suppose a few more passes could do the trick. Anyway, analyzing the algorithm is more difficult than writing it :vP . Apologies for the lack of indentation; I just punched it into ideone in the browser.
http://ideone.com/9I7mx
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
struct list_node {
int v;
list_node *n;
list_node( int inv, list_node *inn )
: v( inv ), n( inn) {}
};
int main() {
srand( time(0) );
// initialize the list and 4 pointers at even intervals
list_node *n_first = new list_node( 0, 0 ), *n = n_first;
list_node *p[4];
p[0] = n_first;
for ( int i = 1; i < 20; ++ i ) {
n = new list_node( i, n );
if ( i % (20/4) == 0 ) p[ i / (20/4) ] = n;
}
// intervals must be coprime to list length!
p[2] = p[2]->n;
p[3] = p[3]->n;
// turn it into a circular list
n_first->n = n;
// swap the pointers around to reshape the circular list
// one swap cuts a circular list in two, or joins two circular lists
// so perform one cut and one join, effectively reordering elements.
for ( int i = 0; i < 20; ++ i ) {
list_node *p_old[4];
copy( p, p + 4, p_old );
p[0] = p[0]->n;
p[1] = p[1]->n;
p[2] = p[2]->n;
p[3] = p[3]->n;
if ( rand() % 2 ) {
swap( p_old[0]->n, p_old[2]->n );
swap( p_old[1]->n, p_old[3]->n );
}
}
// you might want to turn it back into a NULL-terminated list
// print results
for ( int i = 0; i < 20; ++ i ) {
cout << n->v << ", ";
n = n->n;
}
cout << '\n';
}

For the case when N is really big (so it doesn't fit your memory), you can do the following (a sort of Knuth's 3.4.2P):
j = N
k = random between 1 and j
traverse the input list, find k-th item and output it; remove the said item from the sequence (or mark it somehow so that you won't consider it at the next traversal)
decrease j and return to 2 unless j==0
output the rest of the list
Beware that this is O(N^2), unless you can ensure random access in the step 3.
In case the N is relatively small, so that N items fit into the memory, just load them into array and shuffle, like #Mitch proposes.

If you know both N and n, I think you can do it simply. It's fully random, too. You only iterate through the whole list once, and through the randomized part each time you add a node. I think that's O(n+NlogN) or O(n+N^2). I'm not sure. It's based upon updating the conditional probability that a node is selected for the random portion given what happened to previous nodes.
Determine the probability that a certain node will be selected for the random portion given what happened to previous nodes (p=(N-size)/(n-position) where size is number of nodes previously chosen and position is number of nodes previously considered)
If node is not selected for random part, move to step 4. If node is selected for the random part, randomly choose place in random part based upon the size so far (place=(random between 0 and 1) * size, size is again number of previous nodes).
Place the node where it needs to go, update the pointers. Increment size. Change to looking at the node that previously pointed at what you were just looking at and moved.
Increment position, look at the next node.
I don't know C, but I can give you the pseudocode. In this, I refer to the permutation as the first elements that are randomized.
integer size=0; //size of permutation
integer position=0 //number of nodes you've traversed so far
Node head=head of linked list //this holds the node at the head of your linked list.
Node current_node=head //Starting at head, you'll move this down the list to check each node, whether you put it in the list.
Node previous=head //stores the previous node for changing pointers. starts at head to avoid asking for the next field on a null node
While ((size not equal to N) or (current_node is not null)){ //iterating through the list until the permutation is full. We should never pass the end of list, but just in case, I include that condition)
pperm=(N-size)/(n-position) //probability that a selected node will be in the permutation.
if ([generate a random decimal between 0 and 1] < pperm) //this decides whether or not the current node will go in the permutation
if (j is not equal to 0){ //in case we are at start of list, there's no need to change the list
pfirst=1/(size+1) //probability that, if you select a node to be in the permutation, that it will be first. Since the permutation has
//zero elements at start, adding an element will make it the initial node of a permutation and percent chance=1.
integer place_in_permutation = round down([generate a random decimal between 0 and 1]/pfirst) //place in the permutation. note that the head =0.
previous.next=current_node.next
if(place_in_permutation==0){ //if placing current node first, must change the head
current_node.next=head //set the current Node to point to the previous head
head=current_node //set the variable head to point to the current node
}
else{
Node temp=head
for (counter starts at zero. counter is less than place_in_permutation-1. Each iteration, increment counter){
counter=counter.next
} //at this time, temp should point to the node right before the insertion spot
current_node.next=temp.next
temp.next=current_node
}
current_node=previous
}
size++ //since we add one to the permutation, increase the size of the permutation
}
j++;
previous=current_node
current_node=current_node.next
}
You could probably increase the efficiency if you held on to the most recently added node in case you had to add one to the right of it.

Similar to Vlad's answer, here is a slight improvement (statistically):
Indices in algorithm are 1 based.
Initialize lastR = -1
If N <= 1 go to step 6.
Randomize number r between 1 and N.
if r != N
4.1 Traverse the list to item r and its predecessor.
If lastR != -1
If r == lastR, your pointer for the of the r'th item predecessor is still there.
If r < lastR, traverse to it from the beginning of the list.
If r > lastR, traverse to it from the predecessor of the lastR'th item.
4.2 remove the r'th item from the list into a result list as the tail.
4.3 lastR = r
Decrease N by one and go to step 2.
link the tail of the result list to the head of the remaining input list. You now have the original list with the first N items permutated.
Since you do not have random access, this will reduce the traversing time you will need within the list (I assume that by half, so asymptotically, you won't gain anything).

O(NlogN) easy to implement solution that does not require extra storage:
Say you want to randomize L:
is L has 1 or 0 elements you are done
create two empty lists L1 and L2
loop over L destructively moving its elements to L1 or L2 choosing between the two at random.
repeat the process for L1 and L2 (recurse!)
join L1 and L2 into L3
return L3
Update
At step 3, L should be divided into equal sized (+-1) lists L1 and L2 in order to guaranty best case complexity (N*log N). That can be done adjusting the probability of one element going into L1 or L2 dynamically:
p(insert element into L1) = (1/2 * len0(L) - len(L1)) / len(L)
where
len(M) is the current number of elements in list M
len0(L) is the number of elements there was in L at the beginning of step 3

There is an algorithm takes O(sqrt(N)) space and O(N) time, for a singly linked list.
It does not generate a uniform distribution over all permutation sequence, but it can gives good permutation that is not easily distinguishable. The basic idea is similar to permute a matrix by rows and columns as described below.
Algorithm
Let the size of the elements to be N, and m = floor(sqrt(N)). Assuming a "square matrix" N = m*m will make this method much clear.
In the first pass, you should store the pointers of elements that is separated by every m elements as p_0, p_1, p_2, ..., p_m. That is, p_0->next->...->next(m times) == p_1 should be true.
Permute each row
For i = 0 to m do:
Index all elements between p_i->next to p_(i+1)->next in the link list by an array of size O(m)
Shuffle this array using standard method
Relink the elements using this shuffled array
Permute each column.
Initialize an array A to store pointers p_0, ..., p_m. It is used to traverse the columns
For i = 0 to m do
Index all elements pointed A[0], A[1], ..., A[m-1] in the link list by an array of size m
Shuffle this array
Relink the elements using this shuffled array
Advance the pointer to next column A[i] := A[i]->next
Note that p_0 is an element point to the first element and the p_m point to the last element. Also, if N != m*m, you may use m+1 separation for some p_i instead. Now you get a "matrix" such that the p_i point to the start of each row.
Analysis and randomness
Space complexity: This algorithm need O(m) space to store the start of row. O(m) space to store the array and O(m) space to store the extra pointer during column permutation. Hence, time complexity is ~ O(3*sqrt(N)). For N = 1000000, it is around 3000 entries and 12 kB memory.
Time complexity: It is obviously O(N). It either walk through the "matrix" row by row or column by column
Randomness: The first thing to note is that each element can go to anywhere in the matrix by row and column permutation. It is very important that elements can go to anywhere in the linked list. Second, though it does not generate all permutation sequence, it does generate part of them. To find the number of permutation, we assume N=m*m, each row permutation has m! and there is m row, so we have (m!)^m. If column permutation is also include, it is exactly equal to (m!)^(2*m), so it is almost impossible to get the same sequence.
It is highly recommended to repeat the second and third step by at least one more time to get an more random sequence. Because it can suppress almost all the row and column correlation to its original location. It is also important when your list is not "square". Depends on your need, you may want to use even more repetition. The more repetition you use, the more permutation it can be and the more random it is. I remember that it is possible to generate uniform distribution for N=9 and I guess that it is possible to prove that as repetition tends to infinity, it is the same as the true uniform distribution.
Edit: The time and space complexity is tight bound and is almost the same in any situation. I think this space consumption can satisfy your need. If you have any doubt, you may try it in a small list and I think you will find it useful.

The list randomizer below has complexity O(N*log N) and O(1) memory usage.
It is based on the recursive algorithm described on my other post modified to be iterative instead of recursive in order to eliminate the O(logN) memory usage.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct node {
struct node *next;
char *str;
} node;
unsigned int
next_power_of_two(unsigned int v) {
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
return v + 1;
}
void
dump_list(node *l) {
printf("list:");
for (; l; l = l->next) printf(" %s", l->str);
printf("\n");
}
node *
array_to_list(unsigned int len, char *str[]) {
unsigned int i;
node *list;
node **last = &list;
for (i = 0; i < len; i++) {
node *n = malloc(sizeof(node));
n->str = str[i];
*last = n;
last = &n->next;
}
*last = NULL;
return list;
}
node **
reorder_list(node **last, unsigned int po2, unsigned int len) {
node *l = *last;
node **last_a = last;
node *b = NULL;
node **last_b = &b;
unsigned int len_a = 0;
unsigned int i;
for (i = len; i; i--) {
double pa = (1.0 + RAND_MAX) * (po2 - len_a) / i;
unsigned int r = rand();
if (r < pa) {
*last_a = l;
last_a = &l->next;
len_a++;
}
else {
*last_b = l;
last_b = &l->next;
}
l = l->next;
}
*last_b = l;
*last_a = b;
return last_b;
}
unsigned int
min(unsigned int a, unsigned int b) {
return (a > b ? b : a);
}
randomize_list(node **l, unsigned int len) {
unsigned int po2 = next_power_of_two(len);
for (; po2 > 1; po2 >>= 1) {
unsigned int j;
node **last = l;
for (j = 0; j < len; j += po2)
last = reorder_list(last, po2 >> 1, min(po2, len - j));
}
}
int
main(int len, char *str[]) {
if (len > 1) {
node *l;
len--; str++; /* skip program name */
l = array_to_list(len, str);
randomize_list(&l, len);
dump_list(l);
}
return 0;
}
/* try as: a.out list of words foo bar doz li 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
*/
Note that this version of the algorithm is completely cache unfriendly, the recursive version would probably perform much better!

If both the following conditions are true:
you have plenty of program memory (many embedded hardwares execute directly from flash);
your solution does not suffer that your "randomness" repeats often,
Then you can choose a sufficiently large set of specific permutations, defined at programming time, write a code to write the code that implements each, and then iterate over them at runtime.

Finding smallest value in an array most efficiently

There are N values in the array, and one of them is the smallest value. How can I find the smallest value most efficiently?

If they are unsorted, you can't do much but look at each one, which is O(N), and when you're done you'll know the minimum.
Pseudo-code:
small = <biggest value> // such as std::numerical_limits<int>::max
for each element in array:
if (element < small)
small = element
A better way reminded by Ben to me was to just initialize small with the first element:
small = element[0]
for each element in array, starting from 1 (not 0):
if (element < small)
small = element
The above is wrapped in the algorithm header as std::min_element.
If you can keep your array sorted as items are added, then finding it will be O(1), since you can keep the smallest at front.
That's as good as it gets with arrays.

You need too loop through the array, remembering the smallest value you've seen so far. Like this:
int smallest = INT_MAX;
for (int i = 0; i < array_length; i++) {
if (array[i] < smallest) {
smallest = array[i];
}
}

The stl contains a bunch of methods that should be used dependent to the problem.
std::find
std::find_if
std::count
std::find
std::binary_search
std::equal_range
std::lower_bound
std::upper_bound
Now it contains on your data what algorithm to use.
This Artikel contains a perfect table to help choosing the right algorithm.
In the special case where min max should be determined and you are using std::vector or ???* array
std::min_element
std::max_element
can be used.

If you want to be really efficient and you have enough time to spent, use SIMD instruction.
You can compare several pairs in one instruction:
r0 := min(a0, b0)
r1 := min(a1, b1)
r2 := min(a2, b2)
r3 := min(a3, b3)
__m64 _mm_min_pu8(__m64 a , __m64 b );
Today every computer supports it. Other already have written min function for you:
http://smartdata.usbid.com/datasheets/usbid/2001/2001-q1/i_minmax.pdf
or use already ready library.

If the array is sorted in ascending or descending order then you can find it with complexity O(1).
For an array of ascending order the first element is the smallest element, you can get it by arr[0] (0 based indexing).
If the array is sorted in descending order then the last element is the smallest element,you can get it by arr[sizeOfArray-1].
If the array is not sorted then you have to iterate over the array to get the smallest element.In this case time complexity is O(n), here n is the size of array.
int arr[] = {5,7,9,0,-3,2,3,4,56,-7};
int smallest_element=arr[0] //let, first element is the smallest one
for(int i =1;i<sizeOfArray;i++)
{
if(arr[i]<smallest_element)
{
smallest_element=arr[i];
}
}
You can calculate it in input section (when you have to find smallest element from a given array)
int smallest_element;
int arr[100],n;
cin>>n;
for(int i = 0;i<n;i++)
{
cin>>arr[i];
if(i==0)
{
smallest_element=arr[i]; //smallest_element=arr[0];
}
else if(arr[i]<smallest_element)
{
smallest_element = arr[i];
}
}
Also you can get smallest element by built in function
#inclue<algorithm>
int smallest_element = *min_element(arr,arr+n); //here n is the size of array
You can get smallest element of any range by using this function
such as,
int arr[] = {3,2,1,-1,-2,-3};
cout<<*min_element(arr,arr+3); //this will print 1,smallest element of first three element
cout<<*min_element(arr+2,arr+5); // -2, smallest element between third and fifth element (inclusive)
I have used asterisk (*), before min_element() function. Because it returns pointer of smallest element.
All codes are in c++.
You can find the maximum element in opposite way.

Richie's answer is close. It depends upon the language. Here is a good solution for java:
int smallest = Integer.MAX_VALUE;
int array[]; // Assume it is filled.
int array_length = array.length;
for (int i = array_length - 1; i >= 0; i--) {
if (array[i] < smallest) {
smallest = array[i];
}
}
I go through the array in reverse order, because comparing "i" to "array_length" in the loop comparison requires a fetch and a comparison (two operations), whereas comparing "i" to "0" is a single JVM bytecode operation. If the work being done in the loop is negligible, then the loop comparison consumes a sizable fraction of the time.
Of course, others pointed out that encapsulating the array and controlling inserts will help. If getting the minimum was ALL you needed, keeping the list in sorted order is not necessary. Just keep an instance variable that holds the smallest inserted so far, and compare it to each value as it is added to the array. (Of course, this fails if you remove elements. In that case, if you remove the current lowest value, you need to do a scan of the entire array to find the new lowest value.)

An O(1) sollution might be to just guess: The smallest number in your array will often be 0. 0 crops up everywhere. Given that you are only looking at unsigned numbers. But even then: 0 is good enough. Also, looking through all elements for the smallest number is a real pain. Why not just use 0? It could actually be the correct result!
If the interviewer/your teacher doesn't like that answer, try 1, 2 or 3. They also end up being in most homework/interview-scenario numeric arrays...
On a more serious side: How often will you need to perform this operation on the array? Because the sollutions above are all O(n). If you want to do that m times to a list you will be adding new elements to all the time, why not pay some time up front and create a heap? Then finding the smallest element can really be done in O(1), without resulting to cheating.

If finding the minimum is a one time thing, just iterate through the list and find the minimum.
If finding the minimum is a very common thing and you only need to operate on the minimum, use a Heap data structure.
A heap will be faster than doing a sort on the list but the tradeoff is you can only find the minimum.

If you're developing some kind of your own array abstraction, you can get O(1) if you store smallest added value in additional attribute and compare it every time a new item is put into array.
It should look something like this:
class MyArray
{
public:
MyArray() : m_minValue(INT_MAX) {}
void add(int newValue)
{
if (newValue < m_minValue) m_minValue = newValue;
list.push_back( newValue );
}
int min()
{
return m_minValue;
}
private:
int m_minValue;
std::list m_list;
}

//find the min in an array list of #s
$array = array(45,545,134,6735,545,23,434);
$smallest = $array[0];
for($i=1; $i<count($array); $i++){
if($array[$i] < $smallest){
echo $array[$i];
}
}

//smalest number in the array//
double small = x[0];
for(t=0;t<x[t];t++)
{
if(x[t]<small)
{
small=x[t];
}
}
printf("\nThe smallest number is %0.2lf \n",small);

Procedure:
We can use min_element(array, array+size) function . But it iterator
that return the address of minimum element . If we use *min_element(array, array+size) then it will return the minimum value of array.
C++ implementation
#include<bits/stdc++.h>
using namespace std;
int main()
{
int num;
cin>>num;
int arr[10];
for(int i=0; i<num; i++)
{
cin>>arr[i];
}
cout<<*min_element(arr,arr+num)<<endl;
return 0;
}

int small=a[0];
for (int x: a.length)
{
if(a[x]<small)
small=a[x];
}

C++ code
#include <iostream>
using namespace std;
int main() {
int n = 5;
int arr[n] = {12,4,15,6,2};
int min = arr[0];
for (int i=1;i<n;i++){
if (min>arr[i]){
min = arr[i];
}
}
cout << min;
return 0;
}

Top 10 Frequencies in a Hash Table with Linked Lists

The code below will print me the highest frequency it can find in my hash table (of which is a bunch of linked lists) 10 times. I need my code to print the top 10 frequencies in my hash table. I do not know how to do this (code examples would be great, plain english logic/pseudocode is just as great).
I create a temporary hashing list called 'tmp' which is pointing to my hash table 'hashtable'
A while loop then goes through the list and looks for the highest frequency, which is an int 'tmp->freq'
The loop will continue this process of duplicating the highest frequency it finds with the variable 'topfreq' until it reaches the end of the linked lists on the the hash table.
My 'node' is a struct comprising of the variables 'freq' (int) and 'word' (128 char). When the loop has nothing else to search for it prints these two values on screen.
The problem is, I can't wrap my head around figuring out how to find the next lowest number from the number I've just found (and this can include another node with the same freq value, so I have to check that the word is not the same too).
void toptenwords()
{
int topfreq = 0;
int minfreq = 0;
char topword[SIZEOFWORD];
for(int p = 0; p < 10; p++) // We need the top 10 frequencies... so we do this 10 times
{
for(int m = 0; m < HASHTABLESIZE; m++) // Go through the entire hast table
{
node* tmp;
tmp = hashtable[m];
while(tmp != NULL) // Walk through the entire linked list
{
if(tmp->freq > topfreq) // If the freqency on hand is larger that the one found, store...
{
topfreq = tmp->freq;
strcpy(topword, tmp->word);
}
tmp = tmp->next;
}
}
cout << topfreq << "\t" << topword << endl;
}
}
Any and all help would be GREATLY appreciated :)

Keep an array of 10 node pointers, and insert each node into the array, maintaining the array in sorted order. The eleventh node in the array is overwritten on each iteration and contains junk.
void toptenwords()
{
int topfreq = 0;
int minfreq = 0;
node *topwords[11];
int current_topwords = 0;
for(int m = 0; m < HASHTABLESIZE; m++) // Go through the entire hast table
{
node* tmp;
tmp = hashtable[m];
while(tmp != NULL) // Walk through the entire linked list
{
topwords[current_topwords] = tmp;
current_topwords++;
for(int i = current_topwords - 1; i > 0; i--)
{
if(topwords[i]->freq > topwords[i - 1]->freq)
{
node *temp = topwords[i - 1];
topwords[i - 1] = topwords[i];
topwords[i] = temp;
}
else break;
}
if(current_topwords > 10) current_topwords = 10;
tmp = tmp->next;
}
}
}

I would maintain a set of words already used and change the inner-most if condition to test for frequency greater than previous top frequency AND tmp->word not in list of words already used.

When iterating over the hash table (and then over each linked list contained therein) keep a self balancing binary tree (std::set) as a "result" list. As you come across each frequency, insert it into the list, then truncate the list if it has more than 10 entries. When you finish, you'll have a set (sorted list) of the top ten frequencies, which you can manipulate as you desire.
There may be perform gains to be had by using sets instead of linked lists in the hash table itself, but you can work that out for yourself.

Step 1 (Inefficient):
Move the vector into a sorted container via insertion sort, but insert into a container (e.g. linkedlist or vector) of size 10, and drop any elements that fall off the bottom of the list.
Step 2 (Efficient):
Same as step 1, but keep track of the size of the item at the bottom of the list, and skip the insertion step entirely if the current item is too small.

Suppose there are n words in total, and we need the most-frequent k words (here, k = 10).
If n is much larger than k, the most efficient way I know of is to maintain a min-heap (i.e. the top element has the minimum frequency of all elements in the heap). On each iteration, you insert the next frequency into the heap, and if the heap now contains k+1 elements, you remove the smallest. This way, the heap is maintained at a size of k elements throughout, containing at any time the k highest-frequency elements seen so far. At the end of processing, read out the k highest-frequency elements in increasing order.
Time complexity: For each of n words, we do two things: insert into a heap of size at most k, and remove the minimum element. Each operation costs O(log k) time, so the entire loop takes O(nlog k) time. Finally, we read out the k elements from a heap of size at most k, taking O(klog k) time, for a total time of O((n+k)log k). Since we know that k < n, O(klog k) is at worst O(nlog k), so this can be simplified to just O(nlog k).

A hash table containing linked lists of words seems like a peculiar data structure to use if the goal is to accumulate are word frequencies.
Nonetheless, the efficient way to get the ten highest frequency nodes is to insert each into a priority queue/heap, such as the Fibonacci heap, which has O(1) insertion time and O(n) deletion time. Assuming that iteration over the hash table table is fast, this method has a runtime which is O(n×O(1) + 10×O(n)) ≡ O(n).

The absolute fastest way to do this would be to use a SoftHeap. Using a SoftHeap, you can find the top 10 items in O(n) time whereas every other solution posted here would take O(n lg n) time.
http://en.wikipedia.org/wiki/Soft_heap
This wikipedia article shows how to find the median in O(n) time using a softheap, and the top 10 is simply a subset of the median problem. You could then sort the items that were in the top 10 if you needed them in order, and since you're always at most sorting 10 items, it's still O(n) time.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to implement a minimum heap sort to find the kth smallest element? - c++

Related

Order Notation of pop-max in a binary heap

Array balancing point

Randomly permute N first elements of a singly linked list

Finding smallest value in an array most efficiently

Top 10 Frequencies in a Hash Table with Linked Lists

Categories

Resources