Insertion and Bubble Sort Algorithm Theory

What is the difference between the insertion sort algorithm and the bubble sort algorithm?
I have searched everywhere but couldn't find an exact answer.

Insertion Sort divides your array into two parts, a sorted one and an unsorted one. The algorithm takes the first element of the unsorted part and inserts it into the correct place in the sorted part. Because it places each element as it occurs, elements of the sorted part often have to be shifted to make room, which is rather costly.
Bubble Sort, in contrast, iterates over the array and compares two adjacent values at a time. The bigger (or smaller, depending on your implementation) value gets pushed toward the end of the array (it "bubbles up"), and then the algorithm looks at the next pair (the value it just bubbled up and the one after it). After one pass through the array, the biggest (or smallest) value is the last value in the array. The procedure is repeated (leaving the already-sorted values at the end of the array untouched) until the whole array is sorted. If you don't swap the values at each comparison, but instead just remember the position of the biggest value and swap once at the end of the pass, you can get away with one swap per pass.
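For illustration, here is a minimal sketch of both algorithms in C++ (the names insertionSort and bubbleSort are mine, not from the question):

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Insertion sort: grow a sorted prefix, inserting each new element into place.
    void insertionSort(std::vector<int>& a) {
        for (std::size_t i = 1; i < a.size(); ++i) {
            int key = a[i];
            std::size_t j = i;
            // Shift larger elements of the sorted prefix one slot to the right.
            while (j > 0 && a[j - 1] > key) {
                a[j] = a[j - 1];
                --j;
            }
            a[j] = key;
        }
    }

    // Bubble sort: each pass bubbles the largest remaining value to the end.
    void bubbleSort(std::vector<int>& a) {
        if (a.empty()) return;
        for (std::size_t end = a.size() - 1; end > 0; --end) {
            for (std::size_t i = 0; i < end; ++i) {
                if (a[i] > a[i + 1]) {
                    std::swap(a[i], a[i + 1]); // push the bigger value toward the end
                }
            }
        }
    }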

Fastest way to remove an element from an array?

I've coded an algorithm designed to produce a list of the closest triplets from three sorted arrays (one element contributed by each array). The algorithm finds the closest triplet, removes those elements, then repeats the process. The algorithm itself runs quite fast; however, the code I'm using to "remove" the elements from each array slows it down significantly. Is there a more efficient method, or is random removal necessarily always O(n)?
My current strategy (effectively identical to std::move):
for(int i = 6; i < n; ++i)
array[i] = array[i+1];
Where n is the size of the array, and 6 is the index of the element to be removed.
Fastest way to remove an element from an array?
Note that there is no way to erase elements from a raw array: the size of an array cannot change. So, to be clear, we are considering algorithms where the resulting array contains the elements excluding the "removed" value at the beginning of the array, with some irrelevant value at the end.
The algorithm that you show¹ is the optimal one if there is an additional constraint that the order of the other elements must not change. It can be slightly improved by using move assignment if the element type is non-trivial, but that doesn't improve the asymptotic complexity. There is no need to write the loop yourself, since there is a standard algorithm: std::move (the three-argument overload from <algorithm>).
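A minimal sketch of that stable removal with std::move (removeStable and idx are my names; n is the size of the array, as in the question):

    #include <algorithm>

    // Stable removal: shift everything after idx one slot to the left.
    // Afterwards the last slot holds an irrelevant (moved-from) value.
    void removeStable(int* array, int n, int idx) {
        std::move(array + idx + 1, array + n, array + idx);
    }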
If there is no constraint of stable order, then there is a more efficient algorithm: Only write the last element over the "removed" one.
is random removal [from array] necessarily always O(n)?
Only when the remaining elements need to have a stable order.
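A sketch of the unstable, O(1) alternative (same assumed names):

    #include <utility>

    // Unstable removal: overwrite the removed slot with the last element.
    // Returns the new logical size.
    int removeUnstable(int* array, int n, int idx) {
        array[idx] = std::move(array[n - 1]); // plain copy for int; move matters for non-trivial types
        return n - 1;
    }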
¹ However, there is a bug in your implementation:
for(int i = 6; i < n; ++i)
array[i] = array[i+1];
Where n is the size of the array
If n is the size of the array, then on the last iteration (i == n - 1) the read of array[i + 1] accesses array[n], which is outside the bounds of the array. The loop condition should be i < n - 1.
There are a few more options you can consider.
Validity masks
You can keep an additional array of bool, where everything is initially set to false to say the values are not deleted. To delete a value you set the corresponding bool to true (or the other way around if it makes more sense in your code).
This requires a bit of tweaking in the rest of the code to skip values that are marked as deleted.
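A minimal sketch of the idea (the variable names are mine):

    #include <cstddef>
    #include <vector>

    int main() {
        std::vector<int> values = {5, 3, 8, 1};
        std::vector<bool> deleted(values.size(), false); // parallel validity mask

        deleted[2] = true; // "delete" the element at index 2 in O(1)

        // Every later pass must skip masked-out entries.
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (deleted[i]) continue;
            // ... process values[i] ...
        }
    }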
Tombstones
Similar to the solution above, but without the additional memory. If there's a value that is never used (say, if all the values are supposed to be positive, we can use -1), you can set the entry to that value. This also requires tweaks in the rest of the code to skip it.
Delayed deletion
This one is a bit more complicated. I'd only use it if iterating over the deleted entries significantly affects performance or complexity.
The idea is to tombstone or mask the entries as deleted. The next time you iterate over the array, you also do the swaps that compact it. This makes the code somewhat complex; the easiest way to do it, I think, is with custom iterators.
This is still O(N), but it's amortized O(1) within the overall algorithm.
Also note that if you do one O(N) loop to find the element to delete and then another O(N) loop to delete it, the overall operation is still O(N).

How can I generate arrays for testing Best Case of quickSort?

I want to test the time complexity of quickSort, but I don't know how to generate arrays for testing the best case. My version of quickSort takes the last element of the array as the pivot.
Assuming all the array elements are different, you obviously get the worst case if the pivot is always either the smallest or the largest element. That gives you the worst recursion depth and the maximum number of comparisons.
But for the worst case, you will also want many exchanges. Examine how many elements your implementation of quicksort moves when the last element is the smallest, and when it is the largest element in the array. Decide which is worse. Then arrange the numbers in your array so that the last element of each subarray is always that worst case.
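For example, with a last-element pivot and distinct elements, an already sorted array makes the pivot the largest element of every subarray, so one worst-case generator could be as simple as this sketch (worstCaseForLastPivot is my name):

    #include <cstddef>
    #include <numeric>
    #include <vector>

    // Ascending values 0..n-1: with a last-element pivot, every partition
    // step picks the largest remaining element, giving O(n^2) behavior.
    std::vector<int> worstCaseForLastPivot(std::size_t n) {
        std::vector<int> a(n);
        std::iota(a.begin(), a.end(), 0);
        return a;
    }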

Keep std vector/list sorted while inserting, or sort all at once

Let's say I have 30000 objects in my vector/list, which I add one by one.
I need them sorted.
Is it faster to sort them all at once (with std::sort), or to keep the vector/list sorted while I add objects one by one?
The vector/list WILL NOT be modified later.
When you keep your vector sorted while inserting elements one by one, you are basically performing an insertion sort, which runs in O(n^2) in the worst case. The average case is also quadratic, which makes insertion sort impractical for sorting large arrays.
With your input of ~30000 elements, it is better to collect all the inputs and then sort them with a faster sorting algorithm.
EDIT:
As @Veritas pointed out, we can use a faster algorithm (like binary search) to find the position for each element, so the searches take O(n log(n)) time in total.
Though it should also be pointed out that inserting the elements is a factor to be taken into account: each insertion shifts elements to make room, which is O(n) in the worst case, so keeping the array sorted is still O(n^2) overall.
Sorting after all the input has arrived is still by far the better method, rather than keeping the vector sorted after each insertion.
Keeping the vector sorted during insertion would result in quadratic performance, since on average you'll have to shift down approximately half the vector for each item inserted. Sorting once at the end would be O(n log(n)), which is rather faster.
Depending on your needs it's also possible that set or map may be more appropriate.
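A hedged sketch of the two approaches (the per-insert O(n) cost in the first comes from vector::insert shifting elements, not from the O(log n) lower_bound):

    #include <algorithm>
    #include <vector>

    // Approach 1: keep the vector sorted on every insert. O(n) per call.
    void insertSorted(std::vector<int>& v, int x) {
        v.insert(std::lower_bound(v.begin(), v.end(), x), x);
    }

    // Approach 2: append everything, then sort once. O(n log n) total.
    void sortAtEnd(std::vector<int>& v) {
        std::sort(v.begin(), v.end());
    }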

Bubble sort to find the five smallest elements in an array

I have an array list with some integer values, and I need to find only the five smallest elements in the list. Is bubble sort more efficient for this than other sorting algorithms, or what is the best algorithm?
The common approach is to use a binary heap to track the n smallest elements while scanning from one end to the other.
However, for five elements, it might be just as efficient to track the five smallest seen so far in a simple array. For each new element you inspect, if it's smaller than the largest of the five, replace that largest element with the new one.
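A minimal sketch of the heap approach using std::priority_queue (a max-heap by default); kSmallest is my name, and k would be 5 here:

    #include <cstddef>
    #include <queue>
    #include <vector>

    // Track the k smallest elements seen so far in a max-heap of size k.
    std::vector<int> kSmallest(const std::vector<int>& data, std::size_t k) {
        std::priority_queue<int> heap; // top() is the largest tracked value
        for (int x : data) {
            if (heap.size() < k) {
                heap.push(x);
            } else if (x < heap.top()) {
                heap.pop();   // drop the largest tracked value
                heap.push(x); // keep the smaller newcomer
            }
        }
        std::vector<int> result;
        while (!heap.empty()) { result.push_back(heap.top()); heap.pop(); }
        return result; // the k smallest, in descending order
    }

With the standard library you could also get the same result in one call with std::nth_element or std::partial_sort, at the cost of reordering the input.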

How to efficiently compare two vectors in C++, whose content can't be meaningfully sorted?

How to efficiently compare two vectors in C++, whose content can't be meaningfully sorted?
I have read many posts, but most talk about first sorting the two vectors and then comparing the elements. In my case I can't sort the vectors in a way that would help the comparison. That means I would have to do an O(N^2) operation rather than O(N): compare each element from the first vector and try to find a unique match for it in the second vector. If I can find a unique match for every element, the vectors are equal.
Is there an efficient and simple way to do this? Will I have to code it myself?
Edit: by meaningful sorting I mean a way to sort them so that you can later compare them in linear fashion.
Thank you
If the elements can be hashed in some meaningful way, you can get expected O(n) performance by using a hash map: insert all elements from list A into the hash map, and for each element in list B, check whether it exists in the hash map.
In C++, I believe std::unordered_map is the standard hash map implementation (though I haven't used it myself).
Put all elements of vector A into a hash table, where the element is the key, and the value is a counter of how many times you’ve added the element. [O(n) expected]
Iterate over vector B and decrease the counters in the hash table for each element. [O(n) expected]
Iterate over the hash table and check that each counter is 0. [O(n) expected]
= O(n) expected runtime.
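A minimal sketch of that counting approach, assuming the element type is hashable (int here for simplicity; sameElements is my name):

    #include <unordered_map>
    #include <vector>

    // Multiset equality in expected O(n): count elements of a, discount with b.
    bool sameElements(const std::vector<int>& a, const std::vector<int>& b) {
        if (a.size() != b.size()) return false;
        std::unordered_map<int, int> counts;
        for (int x : a) ++counts[x];      // count occurrences in a
        for (int x : b) {
            auto it = counts.find(x);
            if (it == counts.end() || it->second == 0) return false;
            --it->second;                 // discount occurrences in b
        }
        return true; // equal sizes + successful discounts imply all counters are 0
    }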
No reason to use a map, since there are no values to associate with, just keys (the elements themselves). In that case you should look at using a std::set or std::unordered_set. Put all elements from A into a new set, then loop through B, and for each element: if( set.find(element) == set.end() ) return false;
If you're set on sorting the vectors by some arbitrary criterion, you might want to look at C++11's std::hash, which returns a size_t. Using this you can write a comparator that hashes the two objects and compares the hashes, which you can use with std::sort to perform an O(n log n) sort.
If you really can't sort the vectors, you could try this. C++ gurus, please feel free to point out flaws, obvious failures to exploit the STL and other libraries, failures to comprehend previous answers, etc., etc. :) Apologies in advance as necessary.
Have a vector of ints, 0..n-1, called C. These ints are the indices of the elements in vector B. For each element in vector A, compare it against the elements of B at the indices that remain in C. If you find a match, remove that index from C; C is now one shorter. For the next element of A you again search B according to the indices in C, which, being shorter, takes less time. If you get lucky it will be quite quick.
Or you could build up a vector of the B indices that you have already checked, so that you ignore those B's on the next time round the loop. That saves building a complete C first.
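A sketch of the shrinking-index-list idea (uniqueMatch is my name; still O(n^2) in the worst case):

    #include <cstddef>
    #include <numeric>
    #include <vector>

    // Match every element of a against a unique, as-yet-unmatched element of b.
    bool uniqueMatch(const std::vector<int>& a, const std::vector<int>& b) {
        if (a.size() != b.size()) return false;
        std::vector<std::size_t> candidates(b.size());
        std::iota(candidates.begin(), candidates.end(), 0); // the vector "C" of B-indices
        for (int x : a) {
            bool matched = false;
            for (std::size_t i = 0; i < candidates.size(); ++i) {
                if (b[candidates[i]] == x) {
                    // Remove the matched index in O(1); order of C doesn't matter.
                    candidates[i] = candidates.back();
                    candidates.pop_back();
                    matched = true;
                    break;
                }
            }
            if (!matched) return false;
        }
        return true;
    }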