Complexity of function with array having even and odds numbers separate

So I have an array that contains both even and odd numbers.
I have to sort it so that the odd numbers come first, followed by the even numbers.
Here is my approach:
int key,val;
int odd = 0;
int index = 0;
for(int i=0;i<max;i++)
{
    if(arr[i]%2!=0)
    {
        int temp = arr[index];
        arr[index] = arr[i];
        arr[i] = temp;
        index++;
        odd++;
    }
}
First I separate even and odd numbers then I apply sorting to it.
For sorting I have this code:
for (int i=1; i<max;i++)
{
    key=arr[i];
    if(i<odd)
    {
        val = 0;
    }
    if(i>=odd)
    {
        val = odd;
    }
    for(int j=i; j>val && key < arr[j-1]; j--)
    {
        arr[j] = arr[j-1];
        arr[j-1] = key;
    }
}
The problem I am facing is that I can't work out the complexity of the sorting code above.
Insertion sort is applied to the odd numbers first.
When they are done, I skip that part and start sorting the even numbers.
Here is my analysis for an array that is already sorted within each group, e.g. 3 5 7 9 2 6 10 12:
[Image: complexity table]
How does all this work?
The first for loop traverses the array and moves all the odd numbers in front of the even numbers, but it does not sort them.
The second for loop is the insertion sort. While i < odd, the lower bound val is 0, so only the odd numbers are sorted. Once i >= odd, val is set to odd, so the nested for loop no longer walks back through the odd numbers; it only compares against the even numbers and sorts them.

I'm assuming you know the complexity of your partitioning (let's say A) and sorting algorithms (let's call this one B).
You first partition your n-element array, then sort m elements, and finally sort n - m elements. So the total complexity would be:
A(n) + B(m) + B(n - m)
Depending on what A and B actually are you should probably be able to simplify that further.
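As a worked instance of that formula, assume the single-pass partition loop from the question, so A(n) = O(n), and insertion sort for B, whose worst case is B(m) = O(m^2). Here m is the number of odd elements. These choices are assumptions taken from the question's code, not the only possibility:

```latex
\begin{aligned}
T(n) &= A(n) + B(m) + B(n-m) \\
     &= O(n) + O(m^2) + O\big((n-m)^2\big) \\
     &= O(n^2), \qquad \text{since } m^2 + (n-m)^2 \le n^2 \text{ for } 0 \le m \le n .
\end{aligned}
```

The sum m^2 + (n-m)^2 is smallest at m = n/2, where it equals n^2/2, so splitting into two groups saves at most a constant factor in the worst case. If both groups are already sorted, insertion sort does only linear work and the total drops to O(n).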
Edit: Btw, unless the goal of your code is to try and implement partitioning/sorting algorithms, I believe this is much clearer:
#include <algorithm>
#include <iterator>
template <class T>
void partition_and_sort (T & values) {
    // != 0 rather than == 1, so that negative odd numbers (where e % 2 == -1) count as odd
    auto isOdd = [](auto const & e) { return e % 2 != 0; };
    auto middle = std::partition(std::begin(values), std::end(values), isOdd);
    std::sort(std::begin(values), middle);
    std::sort(middle, std::end(values));
}
Complexity in this case is O(n) + 2 * O(n * log(n)) = O(n * log(n)).
Edit 2: I wrongly assumed std::partition keeps the relative order of elements. That's not the case. Fixed the code example.

Related

Find duplicate in unsorted array with best time Complexity

I know there were similar questions, but none this specific.
Input: an n-element array of unsorted elements with values from 1 to (n-1).
One of the values is duplicated (e.g. n=5, tab[n] = {3,4,2,4,1}).
Task: find the duplicate with the best complexity.
I wrote this algorithm:
int tab[] = { 1,6,7,8,9,4,2,2,3,5 };
int arrSize = sizeof(tab)/sizeof(tab[0]);
for (int i = 0; i < arrSize; i++) {
    tab[tab[i] % arrSize] = tab[tab[i] % arrSize] + arrSize;
}
for (int i = 0; i < arrSize; i++) {
    if (tab[i] >= arrSize * 2) {
        std::cout << i;
        break;
    }
}
but I don't think it has the best possible complexity.
Do you know a better method/algorithm? I can use any C++ library, but I don't have any ideas.
Is it possible to get better complexity than O(n)?

In terms of big-O notation, you cannot beat O(n) (the same as your solution here). But you can get better constants and a simpler algorithm by using the fact that the sum of the elements 1,...,n-1 is well known.
int sum = 0;
for (int x : tab) {
    sum += x;
}
int duplicate = sum - n * (n - 1) / 2;  // n is the number of elements in tab
The constants here will be significantly better, as each array index is accessed exactly once, which is much more cache-friendly and efficient on modern architectures.
(Note: this solution ignores integer overflow, but it's easy to account for it by using twice as many bits in sum as there are in the array's elements.)
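A minimal sketch of the sum approach with the overflow note applied, assuming the input really holds the values 1..n-1 with exactly one of them repeated. The function name findDuplicateBySum is made up for this illustration:

```cpp
#include <cstdint>
#include <vector>

// Returns the duplicated value, assuming tab holds the values 1..n-1
// with exactly one of them repeated once (n = tab.size()).
int findDuplicateBySum(const std::vector<int>& tab) {
    const std::int64_t n = static_cast<std::int64_t>(tab.size());
    std::int64_t sum = 0;  // 64-bit accumulator, twice the width of the 32-bit elements
    for (int x : tab) {
        sum += x;
    }
    // 1 + 2 + ... + (n-1) = n*(n-1)/2; whatever is left over is the duplicate.
    return static_cast<int>(sum - n * (n - 1) / 2);
}
```

Called on the question's example {3,4,2,4,1}, the sum is 14 and the expected sum is 10, so the function returns 4.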

Adding the classic answer because it was requested. It is based on the idea that if you xor a number with itself you get 0. So if you xor all numbers from 1 to n - 1 and all numbers in the array you will end up with the duplicate.
int duplicate = arr[0];
for (int i = 1; i < arr.length; i++) {
duplicate = duplicate ^ arr[i] ^ i;
}
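Since the snippet above uses arr.length, which is not C++, here is a rough C++ equivalent of the same XOR idea. It assumes the same input contract (values 1..n-1, one of them repeated), and the name findDuplicateByXor is only illustrative:

```cpp
#include <cstddef>
#include <vector>

// XORs every element with every index 0..n-1. Each value 1..n-1 is cancelled
// by the index equal to it, so only the extra copy of the duplicate survives.
int findDuplicateByXor(const std::vector<int>& arr) {
    int duplicate = 0;
    for (std::size_t i = 0; i < arr.size(); ++i) {
        // Index 0 contributes nothing, because x ^ 0 == x.
        duplicate ^= arr[i] ^ static_cast<int>(i);
    }
    return duplicate;
}
```

Unlike the sum version, this cannot overflow, because XOR never produces a value wider than its operands.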

Don't focus too much on asymptotic complexity. In practice the fastest algorithm is not necessarily the one with the lowest asymptotic complexity. That is because constants are not taken into account: O( huge_constant * N) == O(N) == O( tiny_constant * N).
You cannot inspect N values in less than O(N). Though you do not need a full pass through the array. You can stop once you found the duplicate:
#include <iostream>
#include <vector>
int main() {
std::vector<int> vals{1,2,4,6,5,3,2};
std::vector<bool> present(vals.size());
for (const auto& e : vals) {
if (present[e]) {
std::cout << "duplicate is " << e << "\n";
break;
}
present[e] = true;
}
}
In the "lucky case" the duplicate is at index 2. In the worst case the whole vector has to be scanned. On average it is again O(N) time complexity. Further it uses O(N) additional memory while yours is using no additional memory. Again: Complexity alone cannot tell you which algorithm is faster (especially not for a fixed input size).
No matter how hard you try, you won't beat O(N), because no matter in what order you traverse the elements (and remember already found elements), the best and worst case are always the same: Either the duplicate is in the first two elements you inspect or it's the last, and on average it will be O(N).

Time complexity of using heaps to find Kth largest element

I have some different implementations of the code for finding the Kth largest element in an unsorted array. The three implementations I use all use either min/max heap, but I am having trouble figuring out the runtime complexity for one of them.
Implementation 1:
int findKthLargest(vector<int> vec, int k)
{
// build min-heap
make_heap(vec.begin(), vec.end(), greater<int>());
for (int i = 0; i < k - 1; i++) {
vec.pop_back();
}
return vec.back();
}
Implementation 2:
int findKthLargest(vector<int> vec, int k)
{
// build max-heap
make_heap(vec.begin(), vec.end());
for (int i = 0; i < k - 1; i++) {
// move max. elem to back (from front)
pop_heap(vec.begin(), vec.end());
vec.pop_back();
}
return vec.front();
}
Implementation 3:
int findKthLargest(vector<int> vec, int k)
{
// max-heap prio. q
priority_queue<int> pq(vec.begin(), vec.end());
for (int i = 0; i < k - 1; i++) {
pq.pop();
}
return pq.top();
}
From my reading, I am under the assumption that the runtime for the SECOND one is O(n) + O(klogn) = O(n + klogn). This is because building the max-heap is done in O(n) and popping it will take O(logn)*k if we do so 'k' times.
However, here is where I am getting confused. For the FIRST one, with a min-heap, I assume building the heap is O(n). Since it is a min-heap, larger elements are in the back. Then, popping the back element 'k' times will cost k*O(1) = O(k). Hence, the complexity is O(n + k).
And similarly, for the third one, I assume the complexity is also O(n + klogn) with the same reasoning I had for the max-heap.
But, some sources still say that this problem cannot be done faster than O(n + klogn) with heaps/pqs! In my FIRST example, I think this complexity is O(n + k), however. Correct me if I'm wrong. Need help thx.

Properly implemented, getting the kth largest element from a min-heap is O((n-k) * log(n)). Getting the kth largest element from a max-heap is O(k * log(n)).
Your first implementation is not at all correct. For example, if you wanted to get the largest element from the heap (k == 1), the loop body would never be executed. Your code assumes that the last element in the vector is the largest element on the heap. That is incorrect. For example, consider the heap:
1
3 2
That is a perfectly valid heap, which would be represented by the vector [1,3,2]. Your first implementation would not work to get the 1st or 2nd largest element from that heap.
The second solution looks like it would work.
Your first two solutions end up removing items from vec. Is that what you intended?
The third solution is correct. It takes O(n) to build the heap, and O((k - 1) log n) to remove the (k-1) largest items. And then O(1) to access the largest remaining item.
There is another way to do it, that is potentially faster in practice. The idea is:
build a min-heap of size k from the first k elements in vec
for each following element
if the element is larger than the smallest element on the heap
remove the smallest element from the heap
add the new element to the heap
return element at the top of the heap
This is O(k) to build the initial heap. Then it's O((n-k) log k) in the worst case for the remaining items. The worst case occurs when the initial vector is in ascending order. That doesn't happen very often. In practice, a small percentage of items are added to the heap, so you don't have to do all those removals and insertions.
Some heap implementations have a heap_replace method that combines the two steps of removing the top element and adding the new element. That reduces the running time by a constant factor (i.e. rather than an O(log k) removal followed by an O(log k) insertion, you get a constant-time replacement of the top element, followed by an O(log k) sift down the heap).
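The size-k min-heap idea described above could be sketched in C++ roughly as follows. It uses std::priority_queue with std::greater as the min-heap; the function name kthLargestBoundedHeap and the precondition 1 <= k <= vec.size() are assumptions of this sketch. std::priority_queue has no combined replace operation, so the pop and push are done separately:

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Returns the kth largest element. Assumes 1 <= k <= vec.size().
int kthLargestBoundedHeap(const std::vector<int>& vec, int k) {
    // Min-heap built from the first k elements.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap(
        vec.begin(), vec.begin() + k);
    for (std::size_t i = static_cast<std::size_t>(k); i < vec.size(); ++i) {
        // Only elements larger than the current kth largest can change the answer.
        if (vec[i] > heap.top()) {
            heap.pop();         // O(log k)
            heap.push(vec[i]);  // O(log k)
        }
    }
    // The smallest of the k largest elements is the kth largest overall.
    return heap.top();
}
```

The heap never holds more than k elements, so the extra memory is O(k) rather than O(n).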

This is a heap solution in Java. We remove from the min-heap all elements that are smaller than the kth largest element. After that, the kth largest element is at the top of the min-heap.
class Solution {
int kLargest(int[] arr, int k) {
PriorityQueue<Integer> heap = new PriorityQueue<>((a, b)-> Integer.compare(a, b));
for(int a : arr) {
heap.add(a);
if(heap.size()>k) {
// remove smallest element in the heap
heap.poll();
}
}
// return kth largest element
return heap.poll();
}
}
The worst-case time complexity is O(N log K), where N is the total number of elements. Each of the first K elements costs one heap insertion; after that, every element costs two heap operations (one insert and one remove), each O(log K). You can improve on this with other methods and bring the average-case time complexity of a heap update down to Θ(1).
Quickselect: Θ(N)
If you're looking for a solution that is faster on average, the quickselect algorithm, which is based on quicksort, is a good option. It has an average-case time complexity of O(N) and O(1) space complexity. The worst-case time complexity is O(N^2), but a randomized pivot (used in the following code) makes that scenario very unlikely. The following is quickselect code for finding the kth largest element.
class Solution {
public int findKthLargest(int[] nums, int k) {
return quickselect(nums, k);
}
private int quickselect(int[] nums, int k) {
int n = nums.length;
int start = 0, end = n-1;
while(start<end) {
int ind = partition(nums, start, end);
if(ind == n-k) {
return nums[ind];
} else if(ind < n-k) {
start = ind+1;
} else {
end = ind-1;
}
}
return nums[start];
}
private int partition(int[] nums, int start, int end) {
int pivot = start + (int)(Math.random()*(end-start));
swap(nums, pivot, end);
int left=start;
for(int curr=start; curr<end; curr++) {
if(nums[curr]<nums[end]) {
swap(nums, left, curr);
left++;
}
}
swap(nums, left, end);
return left;
}
private void swap(int[] nums, int i, int j) {
int temp = nums[i];
nums[i] = nums[j];
nums[j] = temp;
}
}
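In C++, the language of the question, the standard library already offers a selection routine, so a hand-written quickselect is often unnecessary. A minimal sketch using std::nth_element, which runs in linear time on average; the function name kthLargestNthElement and the precondition 1 <= k <= nums.size() are assumptions of this example:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Returns the kth largest element. The vector is taken by value so that the
// caller's data is not reordered. Assumes 1 <= k <= nums.size().
int kthLargestNthElement(std::vector<int> nums, int k) {
    auto target = nums.begin() + (k - 1);
    // With std::greater, position k-1 receives the element that would be
    // there if the whole range were sorted in descending order.
    std::nth_element(nums.begin(), target, nums.end(), std::greater<int>());
    return *target;
}
```

Passing the vector by reference instead would avoid the copy, at the cost of leaving the caller's elements partially reordered.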

Trying to understand the Binary Insertion Sort?

Could anyone please tell me how this code sorts the array? I don't get it! And how does this code reduce the complexity of a regular insertion sort?
// Function to sort an array a[] of size 'n'
void insertionSort(int a[], int n)
{
    int i, loc, j, k, selected;
    for (i = 1; i < n; ++i)
    {
        j = i - 1;
        selected = a[i];
        // find the location where selected should be inserted
        loc = binarySearch(a, selected, 0, j);
        // Move all elements after location to create space
        while (j >= loc)
        {
            a[j+1] = a[j];
            j--;
        }
        a[j+1] = selected;
    }
}

This code uses the fact that the portion of the array from zero, inclusive, to i, exclusive, is already sorted. That's why it can run binarySearch for the insertion location of a[i], rather than searching for it linearly.
This clever trick does not change the asymptotic complexity of the algorithm, because the part where elements from loc to i are moved remains linear. In the worst case (which happens when the array is sorted in reverse) each of the N insertion steps will make i moves, for a total of N(N-1)/2 moves.
The only improvement that this algorithm has over the classic insertion sort is the number of comparisons. If comparisons of objects being sorted are computationally expensive, this algorithm can significantly reduce the constant factor.
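The question does not show binarySearch, so the following is only a sketch of the same technique with standard-library pieces substituted in: std::upper_bound plays the role of the binary search and std::rotate plays the role of the shifting loop. The name binaryInsertionSort is made up for this example:

```cpp
#include <algorithm>
#include <vector>

void binaryInsertionSort(std::vector<int>& a) {
    for (auto it = a.begin(); it != a.end(); ++it) {
        // [a.begin(), it) is already sorted, so a binary search finds the
        // insertion point with O(log i) comparisons.
        auto loc = std::upper_bound(a.begin(), it, *it);
        // Shift [loc, it) one slot to the right and move *it into loc.
        // This step is still linear in the distance moved.
        std::rotate(loc, it, it + 1);
    }
}
```

The two steps make the trade-off visible: the search is logarithmic, but the rotate still moves up to i elements, which is why the overall worst case stays quadratic.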

Heapsort CPU time

I have implemented heapsort in C++. It does sort the array, but it is giving me higher CPU times than expected. It is supposed to take on the order of n log(n) operations, and it should be faster than, at least, bubble sort and insertion sort.
Instead, it is giving me higher cpu times than both bubblesort and insertion sort. For example, for a random array of ints (size 100000), I have the following cpu times (in nanoSeconds):
BubbleSort: 1.0957e+11
InsertionSort: 4.46416e+10
MergeSort: 7.2381e+08
HeapSort: 2.04685e+11
This is the code itself:
#include <iostream>
#include <assert.h>
#include <fstream>
#include <vector>
#include <random>
#include <chrono>
using namespace std;
typedef vector<int> intv;
typedef vector<float> flov;
typedef vector<double> douv;
void max_heapify(intv& , int);
void build_max_heap(intv& v);
double hesorti(intv& v)
{
auto t0 =chrono::high_resolution_clock::now();
build_max_heap(v);
int x = 0;
int i = v.size() - 1;
while( i > x)
{
swap(v[i],v[x]);
++x;
--i;
}
auto t1 = chrono::high_resolution_clock::now();
double T = chrono::duration_cast<chrono::nanoseconds>(t1-t0).count();
return T;
}
void max_heapify(intv& v, int i)
{
int left = i + 1, right = i + 2;
int largest;
if( left <= v.size() && v[left] > v[i])
{
largest = left;
}
else
{
largest = i;
}
if( right <= v.size() && v[right] > v[largest])
{
largest = right;
}
if( largest != i)
{
swap(v[i], v[largest]);
max_heapify(v,largest);
}
}
void build_max_heap(intv& v)
{
for( int i = v.size() - 2; i >= 0; --i)
{
max_heapify(v, i);
}
}

There's definitely a problem with the implementation of heap sort.
Looking at hesorti, you can see that it is just reversing the elements of the vector after calling build_max_heap. So somehow build_max_heap isn't just making a heap, it's actually reverse sorting the whole array.
max_heapify already has an issue: in the standard array layout of a heap, the children of the node at array index i are not i+1 and i+2, but 2i+1 and 2i+2. It's being called from the back of the array forwards from build_max_heap. What does this do?
The first time it is called, on the last two elements (when i=n-2), it simply makes sure the larger comes before the smaller. What happens when it is called after that?
Let's do some mathematical induction. Suppose, for all j>i, after calling max_heapify with index j on an array where the numbers v[j+1] through v[n-1] are already in descending order, that the result is that the numbers v[j] through v[n-1] are sorted in descending order. (We've already seen this is true when i=n-2.)
If v[i] is greater or equal to v[i+1] (and therefore v[i+2] as well), no swaps will occur and when max_heapify returns, we know that the values at i through n-1 are in descending order. What happens in the other case?
Here, largest is set to i+1, and by our assumption, v[i+1] is greater than or equal to v[i+2] (and in fact all v[k] for k>i+1) already, so the test against the 'right' index (i+2) never succeeds. v[i] is swapped with v[i+1], making v[i] the largest of the numbers from v[i] through v[n-1], and then max_heapify is called on the elements from i+1 to the end. By our induction assumption, this will sort those elements in descending order, and so we know that now all the elements from v[i] to v[n-1] are in descending order.
Through the power of induction then, we've proved that build_max_heap will reverse sort the elements. The way it does it, is to percolate the elements in turn, working from the back, into their correct position in the reverse-sorted elements that come after it.
Does this look familiar? It's an insertion sort! Except it's sorting in reverse, so when hesorti is called, the sequence of swaps puts it in the correct order.
Insertion sort also has O(n^2) average behaviour, which is why you're getting similar numbers as for bubble sort. It's slower almost certainly because of the convoluted implementation of the insertion step.
TL;DR: Your heap sort is not faster because it isn't actually a heap sort, it's a backwards insert sort followed by an in-place ordering reversal.
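For comparison, here is a minimal sketch of a heapsort that uses the child indices 2i+1 and 2i+2 mentioned above. The names siftDown and heapSort are mine, not from the question, and the timing code is left out:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sifts v[i] down within the first `size` elements so that the subtree
// rooted at i satisfies the max-heap property.
void siftDown(std::vector<int>& v, std::size_t i, std::size_t size) {
    while (true) {
        std::size_t left = 2 * i + 1, right = 2 * i + 2, largest = i;
        if (left < size && v[left] > v[largest]) largest = left;
        if (right < size && v[right] > v[largest]) largest = right;
        if (largest == i) return;
        std::swap(v[i], v[largest]);
        i = largest;
    }
}

void heapSort(std::vector<int>& v) {
    const std::size_t n = v.size();
    // Build the max-heap bottom-up, starting at the last node that has a child.
    for (std::size_t i = n / 2; i-- > 0;) {
        siftDown(v, i, n);
    }
    // Repeatedly move the current maximum to the end of the unsorted part
    // and restore the heap on what remains.
    for (std::size_t end = n; end > 1; --end) {
        std::swap(v[0], v[end - 1]);
        siftDown(v, 0, end - 1);
    }
}
```

Note that the bounds checks use left < size and right < size; the question's left <= v.size() would read past the end of the vector.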

Find the biggest 3 numbers in a vector

I'm trying to make a function to get the 3 biggest numbers in a vector. For example:
Numbers: 1 6 2 5 3 7 4
Result: 5 6 7
I figured I could sort them in descending order, take the 3 numbers at the beginning, and then re-sort them in ascending order, but that would be a waste of memory allocation and execution time. I know there is a simpler solution, but I can't figure it out. And another problem is, what if I have only two numbers...
BTW: I use as compiler BorlandC++ 3.1 (I know, very old, but that's what I'll use at the exam..)
Thanks guys.
LE: If anyone wants to know more about what I'm trying to accomplish, you can check the code:
#include<fstream.h>
#include<conio.h>
int v[1000], n;
ifstream f("bac.in");
void citire();
void afisare_a();
int ultima_cifra(int nr);
void sortare(int asc);
void main() {
clrscr();
citire();
sortare(2);
afisare_a();
getch();
}
void citire() {
f>>n;
for(int i = 0; i < n; i++)
f>>v[i];
f.close();
}
void afisare_a() {
for(int i = 0;i < n; i++)
if(ultima_cifra(v[i]) == 5)
cout<<v[i]<<" ";
}
int ultima_cifra(int nr) {
return nr - 10 * ( nr / 10 );
}
void sortare(int asc) {
int aux, s;
if(asc == 1)
do {
s = 0;
for(int i = 0; i < n-1; i++)
if(v[i] > v[i+1]) {
aux = v[i];
v[i] = v[i+1];
v[i+1] = aux;
s = 1;
}
} while( s == 1);
else
do {
s = 0;
for(int i = 0; i < n-1; i++)
if(v[i] < v[i+1]) {
aux = v[i];
v[i] = v[i+1];
v[i+1] = aux;
s = 1;
}
} while(s == 1);
}
Citire = Read
Afisare = Display
Ultima Cifra = Last digit of number
Sortare = Bubble Sort

If you were using a modern compiler, you could use std::nth_element to find the top three. As is, you'll have to scan through the array keeping track of the three largest elements seen so far at any given time, and when you get to the end, those will be your answer.
For three elements that's a trivial thing to manage. If you had to do the N largest (or smallest) elements when N might be considerably larger, then you'd almost certainly want to use Hoare's select algorithm, just like std::nth_element does.

You could do this without needing to sort at all, it's doable in O(n) time with linear search and 3 variables keeping your 3 largest numbers (or indexes of your largest numbers if this vector won't change).
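A minimal sketch of that single linear scan, written in modern C++ (it would not compile as-is on Borland C++ 3.1). It also covers negative numbers and the question's "only two numbers" case by returning fewer than three values. The function name threeLargest is hypothetical:

```cpp
#include <algorithm>
#include <vector>

// Returns up to three largest values in ascending order
// (fewer if v has fewer than three elements).
std::vector<int> threeLargest(const std::vector<int>& v) {
    std::vector<int> top;  // kept sorted ascending, never more than 3 entries
    for (int x : v) {
        if (top.size() < 3) {
            // Still filling up: insert x at its sorted position.
            top.insert(std::upper_bound(top.begin(), top.end(), x), x);
        } else if (x > top.front()) {
            // x beats the smallest of the current three: replace it.
            top.erase(top.begin());
            top.insert(std::upper_bound(top.begin(), top.end(), x), x);
        }
    }
    return top;
}
```

For the question's input 1 6 2 5 3 7 4 this yields 5 6 7. Because the helper vector never exceeds three entries, each step costs constant time and the whole scan is O(n).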

Why not just step through it once and keep track of the 3 highest numbers encountered?
EDIT: The range of the input matters for how you keep track of the 3 highest numbers.

Use std::partial_sort to sort, in descending order, the first c elements that you care about. It runs in O(n log c) time, which is linear in n for a fixed number c of desired elements.
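A short sketch of what that call could look like. The wrapper name largestDescending is made up, and clamping c to the vector's size is my addition to cover the "only two numbers" case:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the c largest values in descending order
// (all values if the vector holds fewer than c).
std::vector<int> largestDescending(std::vector<int> v, std::size_t c) {
    c = std::min(c, v.size());
    auto middle = v.begin() + static_cast<std::ptrdiff_t>(c);
    // After this call, [begin, middle) holds the c largest values, sorted
    // in descending order; the order of the rest is unspecified.
    std::partial_sort(v.begin(), middle, v.end(), std::greater<int>());
    v.resize(c);
    return v;
}
```

For the question's input 1 6 2 5 3 7 4 and c = 3, this yields 7 6 5; reverse the result if ascending order is required.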

If you can't use std::nth_element write your own selection function.
You can read about them here: http://en.wikipedia.org/wiki/Selection_algorithm#Selecting_k_smallest_or_largest_elements

Sort them normally and then iterate from the back using rbegin(), for as many as you wish to extract (no further than rend() of course).
sort will happen in place whether ASC or DESC by the way, so memory is not an issue since your container element is an int, thus has no encapsulated memory of its own to manage.

Yes, sorting is good, especially for long or variable-length lists.
Why are you sorting it twice, though? The second sort might actually be very inefficient (depends on the algorithm in use). A reverse would be quicker, but why even do that? If you want them in ascending order at the end, then sort them into ascending order first ( and fetch the numbers from the end)

I think you have the choice between scanning the vector for the three largest elements or sorting it (either using sort in a vector or by copying it into an implicitly sorted container like a set).

If you can control how the array is filled, you could insert the numbers in order and then take the first 3; otherwise you can use a binary tree to perform the search, or just use a linear search as birryree says...

Thanks to @nevets1219 for pointing out that the code below only deals with positive numbers.
I haven't tested this code enough, but it's a start:
#include <iostream>
#include <vector>
int main()
{
std::vector<int> nums;
nums.push_back(1);
nums.push_back(6);
nums.push_back(2);
nums.push_back(5);
nums.push_back(3);
nums.push_back(7);
nums.push_back(4);
int first = 0;
int second = 0;
int third = 0;
for (int i = 0; i < nums.size(); i++)
{
if (nums.at(i) > first)
{
third = second;
second = first;
first = nums.at(i);
}
else if (nums.at(i) > second)
{
third = second;
second = nums.at(i);
}
else if (nums.at(i) > third)
{
third = nums.at(i);
}
std::cout << "1st: " << first << " 2nd: " << second << " 3rd: " << third << std::endl;
}
return 0;
}

The following solution finds the three largest numbers in O(n) and preserves their relative order:
std::vector<int>::iterator p = std::max_element(vec.begin(), vec.end());
int x = *p;
*p = std::numeric_limits<int>::min();
std::vector<int>::iterator q = std::max_element(vec.begin(), vec.end());
int y = *q;
*q = std::numeric_limits<int>::min();
int z = *std::max_element(vec.begin(), vec.end());
*q = y; // restore original value
*p = x; // restore original value

A general solution for the top N elements of a vector:
1. Create an array or vector topElements of length N for your top N elements.
2. Initialise each element of topElements to the value of the first element in your vector.
3. Select the next element in the vector, or finish if no elements are left.
4. If the selected element is greater than topElements[0], replace topElements[0] with the value of the element. Otherwise, go to 3.
5. Starting with i = 0, swap topElements[i] with topElements[i + 1] if topElements[i] is greater than topElements[i + 1].
6. While i is less than N - 2, increment i and go to 5.
7. Go to 3.
This should result in topElements containing your top N elements in reverse order of value - that is, the largest value is in topElements[N - 1].
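The steps above can be sketched in C++ as follows, with one deliberate deviation that is my own assumption rather than part of the answer: topElements is seeded from the first N values (sorted) instead of N copies of the first value, because copies of the first value would stay in the result whenever that value happens to be the largest. The function name topN is hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the n largest values in ascending order
// (all values if there are fewer than n).
std::vector<int> topN(const std::vector<int>& values, std::size_t n) {
    n = std::min(n, values.size());
    // Seed with the first n values, sorted so that top[0] is the smallest.
    std::vector<int> top(values.begin(),
                         values.begin() + static_cast<std::ptrdiff_t>(n));
    std::sort(top.begin(), top.end());
    for (std::size_t k = n; k < values.size(); ++k) {
        if (n == 0 || values[k] <= top[0]) continue;  // cannot enter the top n
        top[0] = values[k];  // evict the current smallest
        // Bubble the new value up until the ascending order is restored.
        for (std::size_t i = 0; i + 1 < n && top[i] > top[i + 1]; ++i) {
            std::swap(top[i], top[i + 1]);
        }
    }
    return top;
}
```

Each incoming element costs at most N - 1 swaps, so the total is O(n * N); for large N a heap would bring that down to O(n log N).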
