Algorithm analysis: Am I analyzing these algorithms correctly? How to approach problems like these [closed] - c++

1)
x = 25;
for (int i = 0; i < myArray.length; i++)
{
if (myArray[i] == x)
System.out.println("found!");
}
I think this one is O(n).
2)
for (int r = 0; r < 10000; r++)
for (int c = 0; c < 10000; c++)
if (c % r == 0)
System.out.println("blah!");
I think this one is O(1), because for any input n, it will run 10000 * 10000 times. Not sure if this is right.
3)
a = 0
for (int i = 0; i < k; i++)
{
for (int j = 0; j < i; j++)
a++;
}
I think this one is O(i * k). I don't really know how to approach problems like this where the inner loop is affected by variables being incremented in the outer loop. Some key insights here would be much appreciated. The outer loop runs k times, and the inner loop runs 1 + 2 + 3 + ... + k times. So that sum should be (k/2) * (k+1), which would be order of k^2. So would it actually be O(k^3)? That seems too large. Again, don't know how to approach this.
4)
int key = 0; //key may be any value
int first = 0;
int last = intArray.length - 1;
int mid = 0;
boolean found = false;
while( (!found) && (first <= last) )
{
mid = (first + last) / 2;
if(key == intArray[mid])
found = true;
if(key < intArray[mid])
last = mid - 1;
if(key > intArray[mid])
first = mid + 1;
}
This one, I think is O(log n). But, I came to this conclusion because I believe it is a binary search and I know from reading that the runtime is O(log n). I think it's because you divide the input size by 2 for each iteration of the loop. But, I don't know if this is the correct reasoning or how to approach similar algorithms that I haven't seen and be able to deduce that they run in logarithmic time in a more verifiable or formal way.
5)
int currentMinIndex = 0;
for (int front = 0; front < intArray.length; front++)
{
currentMinIndex = front;
for (int i = front; i < intArray.length; i++)
{
if (intArray[i] < intArray[currentMinIndex])
{
currentMinIndex = i;
}
}
int tmp = intArray[front];
intArray[front] = intArray[currentMinIndex];
intArray[currentMinIndex] = tmp;
}
I am confused about this one. The outer loop runs n times. And the inner for loop runs n + (n-1) + (n-2) + ... + (n - k) + ... + 1 times? So is that O(n^3)?

More or less, yes.
1 is correct - it seems you are searching for a specific element in what I assume is an un-sorted collection. If so, the worst case is that the element is at the very end of the list, hence O(n).
2 is correct, though a bit strange. It is O(1) assuming r and c are constants and the bounds are not variables. If they are constant, then yes O(1) because there is nothing to input.
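3 is O(k^2), not O(k^3). The inner loop body runs 0 + 1 + ... + (k - 1) = k(k - 1)/2 times in total; that sum already accounts for the outer loop, so you do not multiply by k again. Drop the constant factor 1/2 and the lower-order term and you get O(k^2).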
4 looks a lot like a binary search algorithm for a sorted collection. O(logn) is correct. It is log because at each iteration you are essentially halving the # of possible choices in which the element you are looking for could be in.
5 is a selection sort, O(n^2), for similar reasons to 3: the inner loop runs n + (n - 1) + ... + 1 = n(n + 1)/2 times in total.

O() on its own is ambiguous: you need to say whether you mean the worst case or the average case. Some sorting algorithms, quicksort for example, are O(n log n) on average but O(n^2) in the worst case.
Basically you need to count the total number of iterations of the innermost loop and keep the fastest-growing term of the result, without its constant factor (for example, if you have k*(k+1)/2 = 1/2 k^2 + 1/2 k, the biggest term is 1/2 k^2, therefore you are O(k^2)).
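As a quick sanity check of this counting rule, here is a minimal sketch that instruments snippet 3 from the question and compares the count with a closed form. The class and method names are made up for illustration. For that snippet the inner loop runs 0 + 1 + ... + (k - 1) times, so the closed form used here is k(k - 1)/2.

```java
class TriangularLoopCount {
    // Runs the loops from snippet 3 and returns how often a++ executes.
    static long countIncrements(int k) {
        long a = 0;
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < i; j++) {
                a++;
            }
        }
        return a;
    }

    // Closed form of 0 + 1 + ... + (k - 1).
    static long closedForm(int k) {
        return (long) k * (k - 1) / 2;
    }

    public static void main(String[] args) {
        for (int k : new int[] {1, 10, 100, 1000}) {
            System.out.println("k=" + k
                    + " counted=" + countIncrements(k)
                    + " closedForm=" + closedForm(k));
        }
    }
}
```

The counted value grows like k^2/2, which is why the constant and the linear term are dropped to get O(k^2).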
For example, your item 4) is in O(log(n)) because, if you work on an array of size n, then you will run one iteration on this array, and the next one will be on an array of size n/2, then n/4, ..., until this size reaches 1. So it is log(n) iterations.

Your question is mostly about the definition of O().
When someone says an algorithm is O(log(n)), you have to read:
When the input parameter n becomes very big, the number of operations performed by the algorithm grows at most proportionally to log(n).
Now, this means two things:
You have to have at least one input parameter n. There is no point in talking about O() without one (as in your case 2).
You need to define the operations that you are counting. These can be additions, comparison between two elements, number of allocated bytes, number of function calls, but you have to decide. Usually you take the operation that's most costly to you, or the one that will become costly if done too many times.
So keeping this in mind, back to your problems:
1. n is myArray.length, and the operation you're counting is '=='. In that case the answer is exactly n, which is O(n).
2. You can't specify an n.
3. The n can only be k, and the operation you count is ++. You have exactly k*(k-1)/2 of them (the inner loop runs 0 + 1 + ... + (k-1) times), which is O(k^2), as you worked out.
4. This time n is the length of your array again, and what you count is loop iterations (each does a constant number of comparisons). Here the count depends on the data, so we usually talk about the 'worst-case scenario': of all possible inputs, we look at the one that takes the most time. At best, the algorithm takes one iteration. For the worst case, let's take an example. If the array is [1,2,3,4,5,6,7,8,9] and you are looking for 4, intArray[mid] will be successively 5, 2, 3 and then 4, so the loop runs 4 times. In general, each iteration at least halves the range [first, last], so an array of size n needs at most floor(log2(n)) + 1 iterations (you can check: 4 for n = 9), and the complexity is O(log(n)).
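To check the logarithmic bound for case 4 empirically, the sketch below counts loop iterations of the question's binary search and takes the maximum over present and absent keys. The class name, the helper names and the choice of test array (odd numbers, so that even keys are misses) are assumptions made for this illustration.

```java
class BinarySearchProbes {
    // Same loop as snippet 4, but returns the number of iterations performed.
    static int probes(int[] sorted, int key) {
        int first = 0;
        int last = sorted.length - 1;
        int count = 0;
        boolean found = false;
        while (!found && first <= last) {
            int mid = (first + last) / 2;
            count++;
            if (key == sorted[mid]) {
                found = true;
            } else if (key < sorted[mid]) {
                last = mid - 1;
            } else {
                first = mid + 1;
            }
        }
        return count;
    }

    // Maximum iteration count over every present and absent key for size n.
    static int worstCase(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = 2 * i + 1; // odd values, so even keys are guaranteed misses
        }
        int worst = 0;
        for (int key = 0; key <= 2 * n; key++) {
            worst = Math.max(worst, probes(a, key));
        }
        return worst;
    }

    // floor(log2(n)) for n >= 1.
    static int floorLog2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 9, 1000, 1024}) {
            System.out.println("n=" + n + " worst=" + worstCase(n)
                    + " floor(log2 n)+1=" + (floorLog2(n) + 1));
        }
    }
}
```

For every size tried, the worst case matches floor(log2(n)) + 1, which is the formal version of "the range is halved each time".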
In any case, I think you are confused because you don't exactly know what O(n) means. I hope this is a start.


What is the time complexity of this code? Is it O(logn) or O(loglogn)?

int n = 8; // In the video n = 8
int p = 0;
for (int i = 1; i < n; i *= 2) { // In the video i = 1
p++;
}
for (int j = 1; j < p; j *= 2) { // In the video j = 1
//code;
}
This is code from Abdul Bari's YouTube channel (link to the video). They said the time complexity of this is O(log log n), but I think it is O(log n). What is the correct answer?
Fix the initial value. 0 multiplied by 2 will never end the loop.
The last loop is O(log log N) because p == log(n). However, the first loop is O(log N), hence in total it is also O(log N).
On the other hand, once you put some code in place of //code then the first loop can be negligible compared to the second and we have:
O(log N + X * log log N),
where X is the cost of one execution of //code, log N comes from the first loop and X * log log N from the second loop,
and when X is just big enough, one can consider it as O( log log N) in total. However strictly speaking that is wrong, because complexity is about asymptotic behavior and no matter how big X, for N going to infinity, log N will always be bigger than X * log log N at some point.
PS: I assumed that //code does not depend on N, ie it has constant complexity. The above consideration changes if this is not the case.
PPS: In general complexity is important when designing algorithms. When using an algorithm it is rather irrelevant. In that case you rather care about actual runtime for your specific value of N. Complexity can be misleading and even lead to wrong expectations for a specific use case with given N.
You are correct, the time complexity of the complete code is O(log(n)).
But Abdul Bari is also correct, because:
In the video he is deriving the time complexity of the second for loop only, not of the whole code. Watch it again and listen to what he says at this point: https://youtu.be/9SgLBjXqwd4?t=568
Once again, what he derives (at 9 min 28 s in the video) is the time complexity of the second loop, not of the complete code.
If your confusion is clear, please mark this as correct.
The time complexity of
int n = 8; // any input size; 8 is the value used in the question
int p = 0;
for (int i = 1; i < n; i *= 2) { // start at 1, not at 0
p++;
}
is O(log(n)), because you do p++ about log2(n) times (exactly ceil(log2(n)) times for n >= 1). The base of the logarithm does not matter in big-O notation, because changing it only scales the result by a constant.
for (int j = 1; j < p; j *= 2) {
//code;
}
has complexity O(log(log(n))), because you only loop up to p = log(n) while doubling j, so you have O(log(p)), i.e. O(log(log(n))).
However, both together are still O(log(n)), because O(log(n) + log(log(n))) = O(log(n)).
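The two counts are easy to confirm by running the loops. This is only an illustrative sketch: the class and method names are invented, the body of the second loop is assumed to be constant-time, and long counters are used just to avoid overflow when doubling.

```java
class LogLogLoops {
    // First loop from the question: returns p, roughly log2(n).
    static int firstLoop(int n) {
        int p = 0;
        for (long i = 1; i < n; i *= 2) {
            p++;
        }
        return p;
    }

    // Second loop from the question: returns its iteration count, roughly log2(p).
    static int secondLoop(int p) {
        int iterations = 0;
        for (long j = 1; j < p; j *= 2) {
            iterations++; // stands in for the constant-time //code
        }
        return iterations;
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 1024, 1 << 20}) {
            int p = firstLoop(n);
            System.out.println("n=" + n + " first=" + p + " second=" + secondLoop(p));
        }
    }
}
```

Even for n around a million, the first loop runs 20 times and the second only 5, so the first loop dominates the total.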

Lower time complexity of two for loop and optimize this to become 1 for loop [closed]

I want to optimize this loop. Its time complexity is n^2. I want something like n or log(n).
for (int i = 1; i <= n; i++) {
for (int j = i+1; j <= n; j++) {
if (a[i] != a[j] && a[a[i]] == a[a[j]]) {
x = 1;
break;
}
}
}
The a[i] satisfy 1 <= a[i] <= n.
This is what I will try :
Let us call B the image by a[], i.e. the set {a[i]}: B = {b[k]; k = 1..K, such that i exists, a[i] = b[k]}
For each b[k] value, k = 1..K, determine the set Ck = {i; a[i] = b[k]}.
Determining B and the sets Ck can be done in linear time.
Then let us examine the sets Ck one by one.
If Card(Ck) = 1: k++
If Card(Ck) > 1: if two elements of Ck are elements of B, then x = 1; else k++
I will use a table (std::vector<bool>) to memorize if an element of 1..N belongs to B or not.
I hope not having made a mistake. No time to write a programme just now. I could do it later on, but I guess you will be able to do it easily.
Note: I discovered after sending this answer that @Mike Borkland had already proposed something similar in a comment...
Since sometimes you need to see a solution to learn, I'm providing you with a small function that does the job you want. I hope it helps.
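#include <string.h>

#define MIN 1
#define MAX 100000 // 10^5

// arr is 1-based as in the question: the values live in arr[1..arr_size]
// and each one satisfies 1 <= arr[i] <= arr_size (arr[0] is unused).
// Returns 1 if some i, j have arr[i] != arr[j] and arr[arr[i]] == arr[arr[j]].
int seek (const int *arr, int arr_size)
{
    if (arr_size > MAX || arr_size < MIN)
        return 0;
    unsigned char seen[arr_size + 1];    // seen[v]: v == arr[u] for a value u met earlier
    unsigned char indices[arr_size + 1]; // indices[u]: u was already met as a value arr[i]
    memset(seen, 0, arr_size + 1);
    memset(indices, 0, arr_size + 1);
    for (int i = 1; i <= arr_size; i++)
    {
        int u = arr[i];
        if (u < MIN || u > arr_size)
            return 0; // precondition violated, refuse to index out of bounds
        if (indices[u])
            continue; // same value as an earlier element, so not a new candidate
        if (seen[arr[u]])
            return 1; // a different value already maps to arr[u]
        seen[arr[u]] = 1;
        indices[u] = 1;
    }
    return 0;
}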
OK, how and why does this work? First, let's look at the problem the original algorithm is trying to solve; they say a well-stated problem is half of the solution. The problem is to decide whether, in a given integer array A of size n whose elements are bounded between 1 and n ([1,n]), there exist two elements x and y of A such that x != y and A[x] = A[y] (the array at index x and at index y, respectively). Furthermore, we are looking for an algorithm with good time complexity, so that for n = 10000 the implementation runs within one second.
Let's start by analyzing the problem. In the worst case, the array needs to be scanned completely at least once to decide whether such a pair of elements exists. So we can't do better than O(n). But how would you do that? One possible way is to scan the array and record, for each element, that its value has appeared as an index; this can be done in another array B (of size n). Likewise, record that the number stored in A at that index has appeared; this can be done in another array C. If, while scanning, the current element has not yet appeared as an index but the number it points to has already appeared, then return yes. This is the classic trick of using a hash-table-like data structure.
The original tasks were: i) to reduce the time complexity (from O(n^2)), and ii) to make sure the implementation runs within a second for an array of size 10000. The proposed algorithm runs in O(n) time and space complexity. I tested with random arrays and it seems the implementation does its job much faster than required.
Edit: My original answer wasn't very useful, thanks for pointing that out. After checking the comments, I figured the code could help a bit.
Edit 2: I also added the explanation on how it works so it might be useful. I hope it helps :)
You wrote: "I want to optimize this loop. Its time complexity is n^2. I want something like n or log(n)."
Well, the easiest thing is to sort the array first. That's O(n log(n)), and then a linear scan looking for two adjacent elements is also O(n), so the dominant complexity is unchanged at O(n log(n)).
You know how to use std::sort, right? And you know the complexity is O(n log(n))?
And you can figure out how to call std::adjacent_find, and you can see that the complexity must be linear?
The best possible complexity is linear time. This only allows us to make a constant number of linear traversals of the array. That means, if we need some lookup to determine for each element, whether we saw that value before - it needs to be constant time.
Do you know any data structures with constant time insertion and lookups? If so, can you write a simple one-pass loop?
Hint: std::unordered_set is the general solution for constant-time membership tests, and Damien's suggestion of std::vector<bool> is potentially more efficient for your particular case.
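For readers who want to see the one-pass idea spelled out, here is a rough Java sketch of the lookup-table approach next to the original quadratic loop, so the two can be compared on small inputs. It assumes the array is 1-based as in the question (index 0 unused) and that every value lies in 1..n. The class and method names are hypothetical, and boolean arrays play the role that std::vector<bool> plays in the hint.

```java
class PairSearch {
    // Reference version: the nested loops from the question, O(n^2).
    static boolean quadratic(int[] a) {
        int n = a.length - 1; // a[0] is unused, values live in a[1..n]
        for (int i = 1; i <= n; i++) {
            for (int j = i + 1; j <= n; j++) {
                if (a[i] != a[j] && a[a[i]] == a[a[j]]) {
                    return true;
                }
            }
        }
        return false;
    }

    // One pass with two constant-time lookup tables, O(n) time and space.
    static boolean linear(int[] a) {
        int n = a.length - 1;
        boolean[] valueSeen = new boolean[n + 1];  // valueSeen[u]: u occurred as some a[i]
        boolean[] targetSeen = new boolean[n + 1]; // targetSeen[v]: v == a[u] for a seen u
        for (int i = 1; i <= n; i++) {
            int u = a[i];
            if (valueSeen[u]) {
                continue; // repeated value, cannot form a pair with itself
            }
            if (targetSeen[a[u]]) {
                return true; // a different value already maps to a[u]
            }
            valueSeen[u] = true;
            targetSeen[a[u]] = true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] sample = {0, 2, 3, 3}; // a[1]=2, a[2]=3, a[3]=3
        System.out.println(quadratic(sample) + " " + linear(sample));
    }
}
```

Each element is visited once and every table access is constant time, which is where the linear bound comes from.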

Predict an algorithm's theoretical average-case efficiency and order of growth using summation

I need to predict the algorithm's average case efficiency with respect to the size of its inputs using summation/sigma notation to arrive at the final answer. Many resources use summation to predict worst-case, and I couldn't find someone explaining how to predict average case so step-by-step answers are appreciated.
The algorithm contains a nested for loop, with the basic operation inside the innermost loop:
[code redacted]
EDIT: Once the second for loop has been entered, the basic operation always executes, and the loop contains no break or return statements. HOWEVER, the end of the first for loop has a return statement that depends on the value produced by the basic operation, so the contents of the array do affect how many times the basic operation is executed in total for each run of the algorithm.
The array passed to the algorithm has randomly generated contents
I think the predicted average case efficiency is (n^2)/2, making it n^2 order of growth/big Theta of n^2, but I don't know how to theoretically prove this using summation.
Answers are very appreciated!
TL;DR: Your code complexity in average case is Θ(n²) if "basic operation" complexity is Θ(1) and it has no return, break or goto operators.
Explanation: the average-case complexity is just an expectation of the number of operations in your code given the size of the input.
Let's say T(A, n) is a number of operations your code performs given array A of size n. It's easy to see that
T(A, n) = 1 + // int k = ceil(size/2.0);
n * 2 + 1 + // for (int i = 0; i < size; i++){
n * (n * 2 + 1) + // for(int j = 0; j < size; j++){
n * n * X + // //Basic operation
1 // return (some int);
Where X is a number of operations in your "basic operation". As we can see, T(A, n) does not depend on actual contents of the array A. Thus, the expected number of operations given size of the array (which is simply the arithmetical mean of T(A, n) for all possible A for given n) is exactly equal to each of them:
T(n) = T(A, n) = 3 + n * 3 + n * n * (2 + X)
If we assume that X = Θ(1), this expression is Θ(n²).
Even without this assumption we can have an estimate: if X = Θ(f(n)), then your code complexity is T(n) = Θ(f(n)n²). For example, if X is Θ(log n), T(n) = Θ(n² log n)
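Since the question asks for sigma notation, the same count can be written as a double sum. This is a sketch under the cost model of this answer (two operations per loop-header iteration plus one for the final failed test, X per basic operation, both loops running from 0 to n - 1 with no early exit); Pr[A] denotes the probability of input A and is notation introduced here.

```latex
\begin{align*}
T(A, n) &= 1 + (2n + 1) + \sum_{i=0}^{n-1}\Bigl[(2n + 1) + \sum_{j=0}^{n-1} X\Bigr] + 1 \\
        &= 3 + 2n + n(2n + 1) + n^2 X \\
        &= 3 + 3n + (2 + X)\,n^2, \\
\mathbb{E}\bigl[T(A, n)\bigr] &= \sum_{A} \Pr[A]\, T(A, n)
        = T(n) \sum_{A} \Pr[A] = T(n) = \Theta(n^2)
        \quad \text{if } X = \Theta(1).
\end{align*}
```

The expectation collapses because T(A, n) is the same for every A, so the probabilities simply sum to 1. If an early return makes the count depend on A, the inner terms would have to be weighted by the probability of reaching them instead.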

I don't understand the shell sort complexity with shell gap 8,4,2,1 [duplicate]

First, here's my Shell sort code (using Java):
public char[] shellSort(char[] chars) {
int n = chars.length;
int increment = n / 2;
while(increment > 0) {
int last = increment;
while(last < n) {
int current = last - increment;
while(current >= 0) {
if(chars[current] > chars[current + increment]) {
//swap
char tmp = chars[current];
chars[current] = chars[current + increment];
chars[current + increment] = tmp;
current -= increment;
}
else { break; }
}
last++;
}
increment /= 2;
}
return chars;
}
Is this a correct implementation of Shell sort (forgetting for now about the most efficient gap sequence - e.g., 1,3,7,21...)? I ask because I've heard that the best-case time complexity for Shell Sort is O(n). (See http://en.wikipedia.org/wiki/Sorting_algorithm). I can't see this level of efficiency being realized by my code. If I added heuristics to it, then yeah, but as it stands, no.
That being said, my main question now - I'm having difficulty calculating the Big O time complexity for my Shell sort implementation. I identified that the outer-most loop as O(log n), the middle loop as O(n), and the inner-most loop also as O(n), but I realize the inner two loops would not actually be O(n) - they would be much less than this - what should they be? Because obviously this algorithm runs much more efficiently than O((log n) n^2).
Any guidance is much appreciated as I'm very lost! :P
The worst-case of your implementation is Θ(n^2) and the best-case is O(nlogn) which is reasonable for shell-sort.
The best case ∊ O(nlogn):
The best case is when the array is already sorted. That would mean that the inner if statement is never true, making the inner while loop a constant-time operation. Using the bounds you've used for the other loops gives O(nlogn). The best case of O(n) is reached by using a constant number of increments.
The worst case ∊ O(n^2):
Given your upper bound for each loop you get O((log n)n^2) for the worst-case. But add another variable for the gap size g. The number of compare/exchanges needed in the inner while is now <= n/g. The number of compare/exchanges of the middle while is <= n^2/g. Add the upper-bound of the number of compare/exchanges for each gap together: n^2 + n^2/2 + n^2/4 + ... <= 2n^2 ∊ O(n^2). This matches the known worst-case complexity for the gaps you've used.
The worst case ∊ Ω(n^2):
Consider the array where all the even positioned elements are greater than the median. The odd and even elements are not compared until we reach the last increment of 1. The number of compare/exchanges needed for the last iteration is Ω(n^2).
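To see the best-case bound concretely, the following sketch instruments the question's Shell sort (ported to int[] for convenience) and counts comparisons. On already-sorted input each gap g costs exactly n - g comparisons, so for n = 16 the gaps 8, 4, 2, 1 give 8 + 12 + 14 + 15 = 49. The class and helper names are invented for this example.

```java
class ShellSortCount {
    // Shell sort with gaps n/2, n/4, ..., 1; returns the number of comparisons.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        int n = a.length;
        for (int gap = n / 2; gap > 0; gap /= 2) {
            for (int last = gap; last < n; last++) {
                int current = last - gap;
                while (current >= 0) {
                    comparisons++;
                    if (a[current] > a[current + gap]) {
                        int tmp = a[current];
                        a[current] = a[current + gap];
                        a[current + gap] = tmp;
                        current -= gap;
                    } else {
                        break;
                    }
                }
            }
        }
        return comparisons;
    }

    // Helper: the already-sorted array 0, 1, ..., n - 1.
    static int[] ascending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    public static void main(String[] args) {
        for (int n : new int[] {16, 1024}) {
            System.out.println("n=" + n + " comparisons on sorted input="
                    + sortAndCount(ascending(n)));
        }
    }
}
```

For n a power of two the sorted-input count is n*log2(n) - (n - 1), which grows as n log n and matches the best-case analysis above.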
Insertion Sort
If we analyse
static void sort(int[] ary) {
int i, j, insertVal;
int aryLen = ary.length;
for (i = 1; i < aryLen; i++) {
insertVal = ary[i];
j = i;
/*
* while loop exits as soon as it finds left hand side element less than insertVal
*/
while (j >= 1 && ary[j - 1] > insertVal) {
ary[j] = ary[j - 1];
j--;
}
ary[j] = insertVal;
}
}
Hence in the average case the while loop exits about halfway through,
i.e. 1/2 + 2/2 + 3/2 + 4/2 + .... + (n-1)/2 = n(n-1)/4 = Theta(n^2)
So we get roughly (n^2)/4 steps, although the constant factor makes no difference to the order of growth.
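Shell sort is just insertion sort run with gaps n/2, n/4, n/8, ...., 2, 1.
It exploits the best case of insertion sort: once earlier passes have moved small elements to the left, the while loop exits quickly because a smaller element is found just to the left of the inserted one.
With gap g, one pass makes at least n - g comparisons, and there are about log2(n) passes, so even on sorted input the total is (n - n/2) + (n - n/4) + ... + (n - 1), which is Theta(n log n). Note that n/2 + n/4 + n/8 + ... + 1 is a geometric series summing to n - 1, not a harmonic series.
Hence the running time is at least n log n. A bound close to n(logn)^2 holds for some other gap sequences (Pratt's, for example), while for these halving gaps the worst case is Theta(n^2).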

step by step process of finding selection sort big theta notation

I'm having trouble figuring out the process of finding the big theta notation for this selection sort sample. I've read online, mostly in tl;dr form, that nested loops mean it will be O(n^2); however, I don't know how they got it. I need a step-by-step process for finding the notation, i.e. adding up the cost of operations and everything. It would be nice if someone did it for this sample code, so I can understand it more clearly. Thanks in advance...
void select(int selct[])
{
int key;
int comp;
for (int i = 0; i < 5; i++)
{
key = i;
for (int j = i + 1; j < 5; j++)
{
if (selct[key] > selct[j])
{
key = j;
}
}
comp = selct[i];
selct[i] = selct[key];
selct[key] = comp;
}
};
When analyzing the time complexity of an algorithm, I actually find it helpful to not look at the code and to instead think about the core idea driving the algorithm. If you know conceptually what the algorithm is doing, it's often easier to figure out the time complexity by just thinking through what the algorithm is going to do and then deriving the time complexity from there.
Let's apply that approach here. So how exactly does selection sort work? Well, it starts off by finding the minimum value in the last n elements and swapping it to position 0, then finding the minimum value in the last n - 1 elements and swapping it to position 1, then finding the minimum value in the last n - 2 elements and swapping it to position 2, etc.
The "hard part" of the algorithm is figuring out which of the last n - k elements is the smallest. Selection sort does this by iterating over those elements and comparing each against the element that currently is known to be the smallest. That requires n - k - 1 comparisons.
Let's see how many comparisons that is. On the first iteration, we need to make n - 1 comparisons. On the second iteration, we make n - 2 comparisons. On the third, we make n - 3 comparisons. Summing up the number of comparisons gives us a good way of measuring the total work:
(n - 1) + (n - 2) + (n - 3) + ... + 3 + 2 + 1 = n(n - 1) / 2
This is a famous summation - it's worth committing it to memory - and tells us how many comparisons are required. The number of comparisons made is a great proxy for the total amount of work done. Since there are n(n - 1) / 2 = n^2 / 2 - n / 2 = Θ(n^2) comparisons made, the time complexity of selection sort is Θ(n^2).
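The summation can be checked directly by instrumenting the sample code. The sketch below generalizes the hard-coded 5 to the array length, which is an assumption made so that n can vary; the class name is invented.

```java
class SelectionSortCount {
    // Selection sort as in the question; returns the number of comparisons made.
    static long sortAndCount(int[] selct) {
        long comparisons = 0;
        int n = selct.length;
        for (int i = 0; i < n; i++) {
            int key = i;
            for (int j = i + 1; j < n; j++) {
                comparisons++;
                if (selct[key] > selct[j]) {
                    key = j;
                }
            }
            int comp = selct[i];
            selct[i] = selct[key];
            selct[key] = comp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] data = {4, 1, 5, 3, 2};
        long counted = sortAndCount(data);
        System.out.println("comparisons=" + counted
                + " n(n-1)/2=" + (data.length * (data.length - 1) / 2));
    }
}
```

The count is the same for every input of a given length, so for selection sort the best, average and worst cases are all Θ(n^2).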