Both of these algorithms give the same output, but the first one takes nearly double the time (>0.67 s) compared to the second one (0.36 s). How is this possible? Can you tell me the time complexity of both algorithms? If they're the same, why is the time different?
1st algorithm:
for (int i = 0; i < n; i++) {
    cin >> p[i];
    if (i > 0) {
        if (p[i-1] > p[i]) {
            cout << p[i] << " ";
        }
        else {
            cout << "-1" << " ";
        }
    }
}
2nd algorithm:
for (int i = 0; i < n; i++) {
    cin >> p[i];
}
for (int i = 0; i < n-1; i++) {
    if (p[i] > p[i+1]) {
        cout << p[i] << " ";
    }
    else {
        cout << "-1" << " ";
    }
}
Time complexity in a modern processor can be an almost-useless performance statistic.
In this case we have one algorithm that goes from 0 to n-1 (O(N)) and a second that goes from 0 to n-1 twice; the constant drops out, so it's still O(N). The first algorithm has an extra if statement that will be false exactly once, and a decent compiler will obliterate that. We wind up with the same amount of input, the same amount of output, the same number of array accesses (more or less), and the same number of if (a > b) comparisons.
What the second has that the first doesn't is determinism. For the second algorithm, all of the input is read in the first loop, and that one loop determines everything. That means the CPU can see exactly what is going to happen ahead of time: it has all of the numbers, so it knows how every branch of the if will go and can predict with essentially 100% accuracy, load up the caches, and fill the pipelines so everything is ready without missing a beat.
Algorithm 1 can't do that, because the next input value isn't known until the next iteration of the loop. Unless the input pattern is predictable, the CPU is going to guess which way if(p[i-1] > p[i]) goes, and it will guess wrong a lot of the time.
Additional reading: Why is it faster to process a sorted array than an unsorted array?
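To see that effect concretely, here is a minimal, self-contained sketch (not the poster's code): the same O(n) loop over the same data gets noticeably faster once the data is sorted, because the branch outcome becomes predictable. Exact timings depend on the machine and compiler flags.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <vector>

// Time one pass that branches on each element; print the sum so the
// compiler cannot remove the loop.
static double time_pass(const std::vector<int>& v) {
    auto t0 = std::chrono::steady_clock::now();
    long long sum = 0;
    for (int x : v)
        if (x >= 128) sum += x;        // the branch the predictor must guess
    auto t1 = std::chrono::steady_clock::now();
    std::cout << "(sum = " << sum << ") ";
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    std::vector<int> v(20000000);
    std::mt19937 rng(1);
    for (int& x : v) x = rng() % 256;

    std::cout << "unpredictable branch: " << time_pass(v) << " s\n";
    std::sort(v.begin(), v.end());     // same data, now the branch is predictable
    std::cout << "predictable branch:   " << time_pass(v) << " s\n";
}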
Below is a program which finds the length of the longest substring without repeating characters, given a string str.
int test(string str) {
    int left = 0, right = 0, ans = 0;
    unordered_set<char> set;
    while (left < str.size() and right < str.size()) {
        if (set.find(str[right]) == set.end()) set.insert(str[right]);
        else {
            while (str[left] != str[right]) {
                set.erase(str[left]);
                left++;
            }
            left++;
        }
        right++;
        ans = (ans > set.size() ? ans : set.size());
    }
    return ans;
}
What is the time complexity of the above solution? Is it O(n^2) or O(n), where n is the length of the string?
Please note that I have gone through multiple questions on the internet and also read about Big O, but I am still confused. To me, it looks like O(n^2) due to the two while loops, but I want to confirm with experts here.
It's O(n) on average.
What you see here is a sliding window technique (with variable window size, also called "two pointers technique").
Yes, there are two loops, but if you look closely, every iteration of either loop always increases one of the two pointers (either left or right).
In the outer loop, you may or may not enter the inner loop, but right increases at each iteration. The inner loop always increases left.
Both left and right can take at most n different values (the outer loop stops once right >= n, and left never moves past right).
So the outer loop has n iterations (all the values of right from 0 to n-1) and the inner loop has at most n iterations in total (all the possible values of left), which is a worst case of 2n = O(n) iterations.
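As a rough illustration of that counting argument (a sketch, not the poster's exact code), the loop below tracks how many times either pointer moves; for any input the total stays at or below 2n.

#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_set>

int main() {
    std::string str = "abcabcbb";
    std::unordered_set<char> window;
    std::size_t left = 0, moves = 0, ans = 0;
    for (std::size_t right = 0; right < str.size(); right++, moves++) {
        // Shrink from the left until the duplicate (if any) leaves the window.
        while (window.count(str[right])) {
            window.erase(str[left]);
            left++;
            moves++;
        }
        window.insert(str[right]);
        ans = std::max(ans, window.size());
    }
    std::cout << "answer = " << ans
              << ", total pointer moves = " << moves
              << " (never more than 2 * " << str.size() << ")\n";
}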
Worst case complexity
For the sake of completeness, note that I wrote O(n) on average. The reason is that set.find has O(1) average complexity but O(n) in the worst case; the same goes for set.erase. This is because unordered_set is implemented as a hash table, and in the very unlikely case of all your items landing in the same bucket, it needs to iterate over all of them.
So even though we have O(n) iterations of the loop, some individual iterations could cost O(n). That means that in some very unlikely cases, the execution could go up to O(n^2). You shouldn't really worry about it, as the probability of this happening is close to 0, and even though I don't know exactly how char is hashed in C++, I would bet that we will never end up with all characters in the same bucket.
At my university we are learning Big O notation. However, one question that I have is: how do you convert a simple computer algorithm, say for example a linear search, into a mathematical function, say for example 2n^2 + 1?
Here is a simple and non-robust linear search algorithm that I have written in C++11. Note: I have disregarded all header files (iostream) and function parameters just for simplicity. I will just be using basic operators, loops, and data types in order to show the algorithm.
int array[5] = {1,2,3,4,5};
// Variable to hold the value we are searching for
int searchValue;
// Ask the user to enter a search value
cout << "Enter a search value: ";
cin >> searchValue;
// Create a loop to traverse through each element of the array and find
// the search value
for (int i = 0; i < 5; i++)
{
    if (searchValue == array[i])
    {
        cout << "Search Value Found!" << endl;
    }
    else
        // If S.V. not found then print out a message
        cout << "Sorry... Search Value not found" << endl;
}
In conclusion, how do you translate an algorithm into a mathematical function so that we can analyze how efficient it really is using Big O notation? Thanks, world.
First, be aware that it's not always possible to analyze the time complexity of an algorithm: there are some whose complexity we do not know, so we have to rely on experimental data.
All of the methods involve counting the number of operations performed. So first, we have to define the cost of basic operations like assignment, memory allocation, and control structures (if, else, for, ...). Some values I will use (working with different cost models can give different values):
Assignment takes constant time (e.g. int i = 0;)
Basic operations take constant time (+ - * /)
Memory allocation is proportional to the memory allocated: allocating an array of n elements takes linear time.
Conditions take constant time (if, else, else if)
Loops take time proportional to the number of times the body is run.
Basic analysis
The basic analysis of a piece of code is: count the number of operations for each line. Sum those costs. Done.
int i = 1;
i = i*2;
System.out.println(i);
For this, there is one operation on line 1, one on line 2 and one on line 3. Those operations are constant: This is O(1).
for (int i = 0; i < N; i++) {
    System.out.println(i);
}
For a loop, count the number of operations inside the loop and multiply by the number of times the loop is run. There is one operation inside, which takes constant time. It is run n times, so the complexity is n * 1 -> O(n).
for (int i = 0; i < N; i++) {
    for (int j = i; j < N; j++) {
        System.out.println(i+j);
    }
}
This one is trickier because the inner loop starts its iteration at i. Line 3 does 2 operations (addition + print) which take constant time, so it takes constant time. Now, how many times line 3 is run depends on the value of i. Enumerate the cases:
When i = 0, j goes from 0 to N-1, so line 3 is run N times.
When i = 1, j goes from 1 to N-1, so line 3 is run N-1 times.
...
Now, summing all this we have to evaluate N + N-1 + N-2 + ... + 2 + 1. The result of the sum is N*(N+1)/2 which is quadratic, so complexity is O(n^2).
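If you want to convince yourself of that sum, here is a quick sketch that simply counts how many times the inner statement runs and compares the count with N*(N+1)/2:

#include <iostream>

int main() {
    const long long N = 1000;
    long long iterations = 0;
    for (long long i = 0; i < N; i++)
        for (long long j = i; j < N; j++)
            iterations++;                              // one count per run of "line 3"
    std::cout << iterations << " == " << N * (N + 1) / 2 << "\n";   // both print 500500
}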
And that's how it works for many cases: count the number of operations, sum all of them, get the result.
Amortized time
An important notion in complexity theory is amortized time. Let's take this example: running operation() n times:
for (int i = 0; i < N; i++) {
    operation();
}
If one says that operation takes amortized constant time, it means that running n operations took linear time, even though one particular operation may have taken linear time.
Imagine you have an empty array with capacity for 1000 elements. Now insert 1000 elements into it. Easy as pie: every insertion took constant time. Now insert one more element. For that, you have to create a new (bigger) array, copy the data from the old array into the new one, and insert element 1001. The first 1000 insertions took constant time; the last one took linear time. In this case, we say that all insertions took amortized constant time, because the cost of that last insertion was amortized over the others.
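Here is a small sketch of that argument (just bookkeeping, not a real array): push n elements into a capacity-doubling array and count how many element copies the occasional "grow" steps cost in total. The amortized cost per insertion stays constant.

#include <iostream>

int main() {
    const int n = 1000000;
    long long copies = 0;          // elements copied whenever the array grows
    int size = 0, capacity = 1;
    for (int i = 0; i < n; i++) {
        if (size == capacity) {    // full: allocate a bigger array and copy everything
            copies += size;
            capacity *= 2;
        }
        size++;                    // the insertion itself writes one element
    }
    std::cout << "insertions: " << n
              << ", extra copies: " << copies
              << ", amortized copies per insertion: " << (double)copies / n << "\n";
}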
Make assumptions
In some other cases, counting the number of operations requires making hypotheses. A perfect example of this is insertion sort, because it is simple and its running time depends on how the data is ordered.
First, we have to make some more assumptions. Sorting involves two elementary operations: comparing two elements and swapping two elements. Here I will consider both of them to take constant time. Here is the algorithm, where we want to sort the array a:
for (int i = 0; i < a.length; i++) {
    int j = i;
    while (j > 0 && a[j] < a[j-1]) {
        swap(a, j, j-1);
        j--;
    }
}
The first loop is easy: no matter what happens inside, it runs n times, so the running time of the algorithm is at least linear. Now, to evaluate the second loop, we have to make assumptions about how the array is ordered. Usually, we try to define the best-case, worst-case, and average-case running times.
Best case: We never enter the while loop. Is this possible? Yes: if a is already sorted, then a[j] >= a[j-1] no matter what j is, so we never enter the second loop. The only operations done in this case are the assignment on line 2 and the evaluation of the condition on line 3, both of which take constant time. Because of the first loop, those operations are run n times. So in the best case, insertion sort is linear.
Worst case: We leave the while loop only when we reach the beginning of the array; that is, we swap every element all the way down to index 0, for every element in the array. This corresponds to an array sorted in reverse order. In this case, the first element is swapped 0 times, element 2 is swapped 1 time, element 3 is swapped 2 times, and so on up to element n being swapped n-1 times. We already know the result of this sum: worst-case insertion sort is quadratic.
Average case: For the average case, we assume the items are randomly ordered in the array. If you're interested in the maths, the analysis involves probabilities and you can find the proof in many places. The result is quadratic.
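If you want to check the best and worst cases empirically, here is a small sketch (my own swap counter, not part of the algorithm above) that counts the swaps insertion sort performs on a sorted and on a reverse-sorted input:

#include <iostream>
#include <utility>
#include <vector>

// Insertion sort that counts swaps, to check the best and worst cases above.
static long long insertion_sort_swaps(std::vector<int> a) {
    long long swaps = 0;
    for (std::size_t i = 1; i < a.size(); i++) {
        for (std::size_t j = i; j > 0 && a[j] < a[j - 1]; j--) {
            std::swap(a[j], a[j - 1]);
            swaps++;
        }
    }
    return swaps;
}

int main() {
    const int n = 1000;
    std::vector<int> sorted(n), reversed(n);
    for (int i = 0; i < n; i++) {
        sorted[i] = i;            // best case: already sorted
        reversed[i] = n - i;      // worst case: reverse order
    }
    std::cout << "sorted input:   " << insertion_sort_swaps(sorted) << " swaps\n";    // 0
    std::cout << "reversed input: " << insertion_sort_swaps(reversed) << " swaps\n";  // n*(n-1)/2 = 499500
}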
Conclusion
Those were the basics of analyzing the time complexity of an algorithm. These cases were easy, but some algorithms aren't as nice. For example, you can look at the analysis of the pairing heap data structure, which is much more involved.
I have the following 2 codes:
int i = 0;
while (i <= 1000000000 && i != -1) {
    i++;
}
I think the run-time cost is about 4 billion operations,
since the while condition does 3 operations: (i<=1000000000), (i!=-1), and &&,
and
int i = 0;
while (i != -1) {
    if (i >= 1000000000) break;
    i++;
}
which I think costs about 3 billion operations,
since the while condition is 1 operation (i!=-1) and the if is 1 operation (i>=1000000000).
But when I run them, the two codes have the same running time. Why is that?
I have changed the two codes a little bit as follows:
int n = 1000000000;
int i = 0;
while (i <= n && i != -1) {
    i++;
}
int n = 1000000000;
int i = 0;
while (i != -1) {
    if (i >= n) break;
    i++;
}
This time the 3rd code block runs in 2.6 s and the 4th in 3.1 s.
Why did this happen?
What is the time complexity of the four code blocks?
I use the Dev-C++ IDE.
Time complexity and actual running time are two very different things.
Time complexity only has meaning when we are talking about variable input size. It tells you how well an algorithm scales for larger inputs. If we assume that your input is n (or 1000000000 in the first two cases), then all your examples have linear time complexity. Roughly, it means that if you make n twice as large, the running time also doubles.
Actual running time depends on complexity to some extent, but you can't reliably calculate it from the complexity alone. The reasons are compiler optimizations, CPU optimizations, OS thread management, and many others.
I think by 'time complexity' you mean the number of primitive operations for the computer to execute. In that case there is no difference between
while(i<=1000000000 && i!=-1)
and
while(i!=-1) {
if(i>=1000000000) break;
because operator && is most likely implemented not as 'take the first operand, take the second operand, and perform some operation on them', but as a sequence of conditional jumps:
if not FirstCondition then goto FalseBranch
if not SecondCondition then goto FalseBranch
TrueBranch:
... here is your loop body
FalseBranch:
... here is the code after loop
And that's exactly what you did by hand in the second example.
However, this only makes sense for a specific compiler and specific optimization settings (in a release build, your loop will be eliminated entirely by any decent compiler).
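For what it's worth, here is a tiny sketch of the short-circuit behaviour that makes the two forms equivalent: the second operand of && is only evaluated when the first one is true (second_operand is just an illustrative helper, not from the question):

#include <iostream>

// Returns true, but announces it was called, so we can see when && skips it.
static bool second_operand() {
    std::cout << "second operand evaluated\n";
    return true;
}

int main() {
    int i = 5;
    // First operand true: the second operand is evaluated (message prints).
    if (i <= 1000000000 && second_operand()) { }
    // First operand false: && jumps past the second operand (no message).
    if (i == -1 && second_operand()) { }
}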
int maxValue = m[0][0];
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
{
if ( m[i][j] >maxValue )
{
maxValue = m[i][j];
}
}
}
cout<<maxValue<<endl;
int sum = 0;
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
{
sum = sum + m[i][j];
}
}
cout<< sum <<endl;
For the above code I got O(n^2) as the execution time growth.
The way I got it was:
MAX [O(1), O(n^2), O(1), O(1), O(n^2), O(1)]
Both O(n^2) terms are for the for loops. Is this calculation correct?
If I change this code as:
int maxValue = m[0][0];
int sum = 0;
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
{
if ( m[i][j] > maxValue )
{
maxValue = m[i][j];
}
sum += m[i][j];
}
}
cout<<maxValue<<endl;
cout<< sum <<endl;
Big O would still be O(n^2), right?
So does that mean Big O is just an indication of how the time grows with the input data size, and not of how the algorithm is written?
This feels a bit like a homework question to me, but...
Big-Oh is about the algorithm, and specifically how the number of steps performed (or the amount of memory used) by the algorithm grows as the size of the input data grows.
In your case, you are taking N to be the size of the input, and it's confusing because you have a two-dimensional array, NxN. So really, since your algorithm only makes one or two passes over this data, you could call it O(n), where in this case n is the size of your two-dimensional input.
But to answer the heart of your question, your first code makes two passes over the data, and your second code does the same work in a single pass. However, the idea of Big-Oh is that it should give you the order of growth, which means independent of exactly how fast a particular computer runs. So, it might be that my computer is twice as fast as yours, so I can run your first code in about the same time as you run the second code. So we want to ignore those kinds of differences and say that both algorithms make a fixed number of passes over the data, so for the purposes of "order of growth", one pass, two passes, three passes, it doesn't matter. It's all about the same as one pass.
It's probably easier to think about this without thinking about the NxN input. Just think about a single list of N numbers, and say you want to do something to it, like find the max value, or sort the list. If you have 100 items in your list, you can find the max in 100 steps, and if you have 1000 items, you can do it in 1000 steps. So the order of growth is linear with the size of the input: O(n). On the other hand, if you want to sort it, you might write an algorithm that makes roughly a full pass over the data each time it finds the next item to be inserted, and it has to do that roughly once for each element in the list, so that's making n passes over your list of length n, which is O(n^2). If you have 100 items in your list, that's roughly 10^4 steps, and if you have 1000 items in your list, that's roughly 10^6 steps. The idea is that those numbers grow really fast in comparison to the size of your input, so even if I have a much faster computer (e.g., a model 10 years better than yours), I might be able to beat you in the max problem even with a list 2 or 10 or even 100 or 1000 times as long. But for the sorting problem with an O(n^2) algorithm, I won't be able to beat you when I try to take on a list that's 100 or 1000 times as long, even with a computer 10 or 20 years better than yours. That's the idea of Big-Oh: to factor out those "relatively unimportant" speed differences and be able to see what amount of work, in a more general/theoretical sense, a given algorithm does on a given input size.
Of course, in real life, it may make a huge difference to you that one computer is 100 times faster than another. If you are trying to solve a particular problem with a fixed maximum input size, and your code is running at 1/10 the speed that your boss is demanding, and you get a new computer that runs 10 times faster, your problem is solved without needing to write a better algorithm. But the point is that if you ever wanted to handle larger (much larger) data sets, you couldn't just wait for a faster computer.
Big O notation is an upper bound on the amount of time taken to execute the algorithm as a function of the input size. So two algorithms can have slightly different maximum running times but the same Big O notation.
What you need to understand is that a running-time function that is linear in the input size has Big O notation O(n), and a quadratic function will always be O(n^2).
So if your running time is just n, that is one linear pass, the Big O notation stays O(n); and if your running time is 6n + c, that is 6 linear passes plus a constant time c, it is still O(n).
Now, in the case above, the second code is more optimized, since it makes fewer passes over the memory locations used by the loops, and hence it will execute faster. But both codes still have an asymptotic running time of O(n^2).
Yes, it's O(N^2) in both cases. Of course O() time complexity depends on how you have written your algorithm, but both the versions above are O(N^2). However, note that actually N^2 is the size of your input data (it's an N x N matrix), so this would be better characterized as a linear time algorithm O(n) where n is the size of the input, i.e. n = N x N.
#include <iostream>
using namespace std;

int main()
{
    int a[10] = {1,2,3,5,2,3,1,5,3,1};
    int i;
    int c[10] = {0};
    for (i = 0; i < 10; i++)
        c[a[i]]++;
    for (i = 0; i < 10; i++)
        cout << i << ": " << c[i] << endl;
    return 0;
}
The running time of the algorithm is O(n), but it's using O(n) extra space. Can I do better?
Thanks!
It depends on what is important to you: you can create an algorithm taking O(n^2) time but O(1) space (using two loops, see the code below), but you can't improve the time complexity below O(n).
for (i = 0; i < 10; i++) {
    count = 0;
    for (j = 0; j < 10; j++)
        if (a[j] == i) count++;
    cout << i << ": " << count << endl;
}
Another possibility for O(1) space would be an in-place sort of the array followed by a single traversal, which has time complexity O(n log n) using an in-place merge sort.
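A sketch of that sort-then-count idea (using std::sort for illustration; the strictly O(1)-extra-space version would use an in-place merge sort, and note this prints only values that actually occur):

#include <algorithm>
#include <iostream>

int main() {
    int a[10] = {1, 2, 3, 5, 2, 3, 1, 5, 3, 1};
    std::sort(a, a + 10);                         // sort the array in place
    for (int i = 0; i < 10; ) {
        int j = i;
        while (j < 10 && a[j] == a[i]) j++;       // advance past the run of equal values
        std::cout << a[i] << ": " << (j - i) << "\n";
        i = j;
    }
}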
No you can't. That's the best you can do.
What is "efficient"? Show us you performance requirements and performance measurements. Then we can tell you if it's efficient. Until then this is a wide open question with lots of wrong answers and no right answer.
The answers thus far are correct only the word 'efficient' means 'runs as fast possible'.
Maybe you have a fast computer with little RAM.
You can always make a piece of code run faster, or use less memory, or less disk space, or less.... If it is not 'efficient' enough, I have seen guys hand-craft assembly to make it faster. Usually it's a waste of time and effort. Optimizing code that has not been profiled is a fool's game.
If all the numbers are in the range 1 to n, then it can be done with O(n) time complexity and O(1) space complexity.
If there is an index X in array A such that A[X] = Y, then add N to the value present at index Y, so A[Y] becomes original + N. Continuing this, the values will be of the form original + K*N where K >= 0. To retrieve the original element we can compute (original + K*N) % N, since (x + K*N) % N = x, and the count can be found by (original + K*N) / N.
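Here is a small sketch of that trick, assuming the values are in the range 0 to N-1 (shift them by one first if they run from 1 to N):

#include <iostream>

int main() {
    const int N = 10;
    int a[N] = {1, 2, 3, 5, 2, 3, 1, 5, 3, 1};    // all values in 0..N-1

    // Encode: a[i] % N is always the original value, even after other slots
    // have been incremented, so one pass over the same array is enough.
    for (int i = 0; i < N; i++)
        a[a[i] % N] += N;

    // Decode: a[v] / N is how many times v occurred; a[v] % N is still the original value.
    for (int v = 0; v < N; v++)
        std::cout << v << ": " << a[v] / N << "\n";
}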