How should the time complexity of the following recursive code be calculated? - python-2.7

Should the time complexity be calculated as T(n-1,m-1) / T(n-1,m), or as T(n-2)?
def isDeelRijRecursief(lijst1, lijst2):
    if len(lijst1) == 1:  # comparison -- take a constant from this
        if len(lijst2) > 1:
            return False
    if len(lijst2) == 1:  # comparison
        for i in range(len(lijst1)):
            if lijst2[0] == lijst1[i]:
                return True
        return False
    else:
        if lijst1[0] == lijst2[0]:  # comparison
            return isDeelRijRecursief(lijst1[1:], lijst2[1:])  # T(n-1,m-1)? or T(n-2)?
        else:
            return isDeelRijRecursief(lijst1[1:], lijst2)  # or T(n-1,m)? or T(n-1)?

Time complexity is usually defined in big-O notation to signify the asymptotic complexity of the evaluated function, that is, its complexity in the limit.
Let's say the sizes of the first and second lists are N and M respectively. The recursive step of your function always creates a new sublist of size N-1, which means the recursion bottoms out after at most N steps. The base case (the branch for len(lijst2) == 1) also performs at most N operations. This means the time complexity of the function is in fact O(N); that is, the number of operations this algorithm performs is on the order of N, or asymptotically linear in the size of its first argument.
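Written as a recurrence in the question's own notation (and ignoring the cost Python pays to build the slices lijst1[1:] and lijst2[1:]), each call does a constant amount of work plus one recursive call on a shorter first list: T(N,M) = c + T(N-1,M-1) in the matching branch and T(N,M) = c + T(N-1,M) otherwise. In both branches the first argument shrinks by one, so the recursion depth is at most N, and the base case adds at most another N comparisons, giving O(N) overall.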

Related

How to calculate time complexity?

I'm really having trouble calculating big O. I get the basics, but when it gets to nested for loops and all that, my mind just blanks out. I was asked to write down the complexity of the following algorithm, which I have no clue how to do. The input string contains only A, B, C and D.
string solution(string &S) {
    int length = S.length();
    int i = 0;
    while (i < length - 1)
    {
        if ((S[i] == 'A' && S[i+1] == 'B') || (S[i] == 'B' && S[i+1] == 'A'))
        {
            S = S.erase(i, 2);
            i = 0;
            length = S.length();
        }
        if ((S[i] == 'C' && S[i+1] == 'D') || (S[i] == 'D' && S[i+1] == 'C'))
        {
            S = S.erase(i, 2);
            i = 0;
            length = S.length();
        }
        i++;
    }
    return S;
}
What would the big O of this algorithm be?
It is O(n^2). Consider this worst-case input:
DDDDDDDDDDDDDDDDDDDABABABABABABABABABABABAB
The first n/2 characters are D.
The last n/2 characters are AB pairs.
For each AB pair (there are n/4 of them) you pay O(n) for:
resetting i to 0 and rescanning from the start, and
shifting all successive elements to fill the gap created by erase.
Total:
O(n) * (O(n) + O(n)) = O(n^2)
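For a concrete feel of the quadratic growth, you can run the same loop on this worst-case shape and count how many times the body executes; doubling n should roughly quadruple the count. A minimal sketch (the countIterations helper is just for illustration):

#include <iostream>
#include <string>

// Runs the question's loop on a copy of S and counts iterations of the while body.
long long countIterations(std::string S) {
    long long iterations = 0;
    int length = static_cast<int>(S.length());
    int i = 0;
    while (i < length - 1) {
        ++iterations;
        if ((S[i] == 'A' && S[i + 1] == 'B') || (S[i] == 'B' && S[i + 1] == 'A')) {
            S.erase(i, 2);
            i = 0;
            length = static_cast<int>(S.length());
        }
        if ((S[i] == 'C' && S[i + 1] == 'D') || (S[i] == 'D' && S[i + 1] == 'C')) {
            S.erase(i, 2);
            i = 0;
            length = static_cast<int>(S.length());
        }
        i++;
    }
    return iterations;
}

int main() {
    // Worst case from above: n/2 'D's followed by n/4 "AB" pairs (total length n).
    for (int n = 40; n <= 640; n *= 2) {
        std::string worst(n / 2, 'D');
        for (int k = 0; k < n / 4; ++k) worst += "AB";
        std::cout << "n = " << n << "  iterations = " << countIterations(worst) << '\n';
    }
}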
It's easy to get hung up on the precise details of how efficient an algorithm is. Fundamentally, though, all you're concerned about is whether the operation is:
Constant time
Proportional to the number of elements
Proportional to the square of the number of elements
etc...
Look at this for guidance on how to estimate the Big-O for a compound operation:
https://hackernoon.com/big-o-for-beginners-622a64760e2
The big-O essentially defines the worst-case complexity of a method, with particular regard to effects that would be observed with very large n. On the face of it you would consider how many times you repeat an operation, but you also need to consider whether any methods you call (e.g. string erase, string length) have complexity that's "constant time", "proportional to the number of elements", "proportional to the number of elements squared" and so on.
So if your outer loop performs n scans but also invokes methods which also perform n scans on up to every item then you end up with O(n^2).
The main concern is the dominant term: you could have a very time-consuming linear-complexity operation alongside a very fast, say, fourth-power one. In such a case it's considered to be O(n^4) (as opposed to O(20000n + n^4)), because as n tends to infinity all of the lower-order terms become insignificant. See here: https://en.wikipedia.org/wiki/Big_O_notation#Properties
So in your case, you have the following loops:
Repetition of the scan (setting i=0), whose frequency is proportional to the number of matches (worst case n for argument's sake; even if it's only a fraction of n, it remains significant as n grows). Although this is not literally the outer loop, it fundamentally governs how many times the other scans are performed.
The string scan itself, whose cost is proportional to the length (n), PLUS the loop inside string erase, which is also n in the worst case. These two run one after the other, and both are governed by the frequency of the aforementioned repetition. As stated elsewhere, O(n) + O(n) reduces to O(n) because we only care about the dominant term.
So in this case the complexity is O(n^2).
A separate consideration when assessing the performance of any algorithm is how cache-friendly it is; algorithms using hashmaps, linked lists etc. are considered prima facie to be more efficient, but in some cases an O(n^2) algorithm that operates within a cache line and doesn't incur page faults or cache flushes can execute a lot faster than a supposedly more efficient algorithm whose memory is scattered all over the place.
I guess this would be O(n) because there is one loop that's going through the string.
The longer the string, the more time it takes, so I would say O(n).
In big-O notation, you give the answer for the worst case. Here the worst case will be that the string does not satisfy any of the if statements. Then the time complexity will be O(n) because there is only one loop.

Calculating Big-O Runtime

I'm in dire need of some guidance with calculating Big-O runtime for the following C++ function:
Fraction Polynomial::solve(const Fraction& x) const {
    Fraction rc;
    auto it = poly_.begin();
    while (it != poly_.end()) {
        Term t = *it;
        // find x^exp
        Fraction curr(1, 1);
        for (int i = 0; i < t.exponent_; i++) {
            curr = curr * x;
        }
        rc += t.coefficient_ * curr;
        it++;
    }
    return rc;
}
This is still a new concept to me, so I'm having a bit of trouble with getting it right. I'm assuming that there are at least two operations that happen once (auto it = poly_.begin, and the return rc at the end), but I am not sure how to count the number of operations with the while loop. According to my professor, the correct runtime is not O(n). If anyone could offer any guidance, it would be greatly appreciated. I want to understand how to answer this question, but I couldn't find anything else like this function online, so here I am. Thank you.
I assume you want to evaluate a certain polynomial (let us say A_n*X^n + ... + A_0) at a given point (a rational value, since it is given as a Fraction).
The first while loop will iterate through all the individual components of your polynomial. For an n-degree polynomial, that will yield n + 1 iterations, so the outer loop alone takes O(n) time.
However, for every term (let us say of rank i) of the polynomial, you have to compute the value of X^i, and that is what your inner for loop does. It computes X^i by repeated multiplication, yielding linear complexity: O(i).
Since you have two nested loops the overall complexity is obtained by multiplying the worst-case time complexities of the loops. The resulting complexity is given by O(n) * O(n) = O(n^2). (First term indicates the complexity of the while loop, while the second one indicates the worst-case time complexity for computing X^i, which is O(n) when i == n).
Assuming this is an n-th order polynomial (the highest term is raised to the power n):
In the outer while loop, you will iterate through n+1 terms (0 to n inclusive on both sides).
For each term, in the inner for loop, you perform the multiplication m times, where m is the power of the current term. Since this is an n-th order polynomial, m ranges from 0 to n. On average, you perform the multiplication n/2 times.
The overall complexity will be O((n+1) * (n/2)) = O(n^2).
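Put as an exact count (assuming the polynomial has a term of every degree from 0 to n): the inner loop performs i multiplications for the term of degree i, so the total number of multiplications is 0 + 1 + 2 + ... + n = n(n+1)/2, which is Θ(n^2).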

What is the Big-O of code that uses random number generators?

I want to fill the array 'a' with random values from 1 to N (no repeated values). Let's suppose the Big-O of randInt(i, j) is O(1) and that this function generates random values from i to j.
Examples of the output are:
{1,2,3,4,5} or {2,3,1,4,5} or {5,4,2,1,3} but not {1,2,1,3,4}
#include <set>
using std::set;

set<int> S;  // space O(N) ?
int a[N];    // space O(N)
int i = 0;   // space O(1)
do {
    int val = randInt(1, N);      // space O(1), time O(1) -- is the variable val created many times?
    if (S.find(val) == S.end()) { // time O(log N)? -- keep val only if it has not been seen yet
        a[i] = val;               // time O(1)
        i++;                      // time O(1)
        S.insert(val);            // time O(log N) <-- executed N times, so O(N log N)
    }
} while (S.size() < N);           // time O(1)
The while loop will continue until we generate all the values from 1 to N.
My understanding is that a set keeps its values ordered, and that find and insert each take O(log N).
Big-O = O(1) + O(X*log N) + O(N*log N) = O(X*log N)
where X is the number of do-loop iterations; the more values are already in the Set, the lower the probability of generating a number that is not in it.
time O(X log N)
space O(2N+1) => O(N), since we reuse the space of val
It is very hard to generate all-distinct numbers as randInt keeps being executed, so I expect the loop to execute at least N times.
Is the variable val created many times?
What would be a good value for X?
Suppose that the RNG is ideal. That is, repeated calls to randInt(1,N) generate an i.i.d. (independent and identically distributed) sequence of values uniformly distributed on {1,...,N}.
(Of course, in reality the RNG won't be ideal. But let's go with it since it makes the math easier.)
Average case
In the first iteration, a random value val1 is chosen which of course is not in the set S yet.
In the next iteration, another random value is chosen.
With probability (N-1)/N, it will be distinct from val1 and the inner conditional will be executed. In this case, call the chosen value val2.
Otherwise (with probability 1/N), the chosen value will be equal to val1. Retry.
How many iterations does it take on average until a valid (distinct from val1) val2 is chosen? Well, we have an independent sequence of attempts, each of which succeeds with probability (N-1)/N, and we want to know how many attempts it takes on average until the first success. This is a geometric distribution, and in general a geometric distribution with success probability p has mean 1/p. Thus, it takes N/(N-1) attempts on average to choose val2.
Similarly, it takes N/(N-2) attempts on average to choose val3 distinct from val1 and val2, and so on. Finally, the N-th value takes N/1 = N attempts on average.
In total the do loop will be executed
N/N + N/(N-1) + N/(N-2) + ... + N/1 = N * (1/1 + 1/2 + ... + 1/N)
times on average. The sum in parentheses is the N-th harmonic number H_N, which can be roughly approximated by ln(N). (There's a well-known better approximation which is a bit more complicated and involves the Euler-Mascheroni constant, but ln(N) is good enough for finding asymptotic complexity.)
So to an approximation, the average number of iterations will be N ln N.
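As a quick sanity check with a small N: for N = 4 the exact expected number of iterations is 4/4 + 4/3 + 4/2 + 4/1 = 4*H_4 ≈ 8.33, while N ln N ≈ 5.55; the harmonic-number count is exact, and the ln(N) approximation is rough for small N.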
What about the rest of the algorithm? Things like inserting N things into a set also take at most O(N log N) time, so can be disregarded. The big remaining thing is that on each iteration you have to check whether the chosen random value lies in S, which takes logarithmic time in the current size of S. So we have to sum, over all N values to be inserted, the expected number of attempts for that value times the logarithmic cost of each lookup; from numerical experiments, this sum appears to be approximately equal to N/2 * (ln N)^2 for large N. (Consider asking for a proof of this on math.SE, perhaps.) EDIT: See this math.SE answer for a short informal proof, and the other answer to that question for a more formal proof.
So in conclusion, the total average complexity is Θ(N (ln N)^2).
Again, this is assuming that the RNG is ideal.
Worst case
Like xaxxon mentioned, it is in principle possible (though unlikely) that the algorithm will not terminate at all. Thus, the worst case complexity would be O(∞).
That's a very bad algorithm for achieving your goal.
Simply fill the array with the numbers 1 through N and then shuffle.
That's O(N)
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
To shuffle, pick an index between 0 and N-1 and swap it with index 0. Then pick an index between 1 and N-1 and swap it with index 1. All the way until the end of the list.
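A minimal sketch of that fill-then-shuffle approach (illustration only; it reuses the randInt(i, j) helper assumed by the question, stubbed out with <random> so the example compiles on its own):

#include <iostream>
#include <random>
#include <utility>

// Stand-in for the question's randInt(i, j): uniform random integer in [i, j], assumed O(1).
int randInt(int i, int j) {
    static std::mt19937 gen{std::random_device{}()};
    return std::uniform_int_distribution<int>(i, j)(gen);
}

int main() {
    const int N = 10;
    int a[N];
    for (int k = 0; k < N; ++k) a[k] = k + 1;  // fill with 1..N -- O(N)
    for (int k = 0; k < N - 1; ++k) {          // Fisher-Yates shuffle -- O(N)
        int r = randInt(k, N - 1);             // pick an index in [k, N-1]
        std::swap(a[k], a[r]);                 // swap it into position k
    }
    for (int v : a) std::cout << v << ' ';     // prints some permutation of 1..N
    std::cout << '\n';
}

Both loops are linear, so the whole thing is O(N), and it always terminates.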
In terms of your specific question, it depends on the behavior of your random number generator. If it's truly random, it may never complete. If it's pseudorandom, it depends on the period of the generator. If it has a period of 5, then you'll never have any dupes.
It's catastrophically bad code with complex behaviour. Generating the first number is O(1); the second involves a binary search, so a log N, plus a rerun of the generator should the number already have been used. The chance of getting a new number is p = 1 - i/N, so the average number of re-runs is the reciprocal, which gives you another factor of N. So O(N^2 log N).
The way to do it is to generate the numbers, then shuffle them. That's O(N).

Time complexity in recursive function in which recursion reduces size

I have to estimate time complexity of Solve():
// These methods and list<Element> Elements belong to the Solver class
void Solver::Solve()
{
    while (List is not empty)
        Recursive();
}

void Solver::Recursive(some parameters)
{
    Element WhatCanISolve = WhatCanISolve(some parameters); // O(n) in List size. When called directly from Solve() it will always return a valid element. When called by recursion it may or may not return an element.
    if (WhatCanISolve == null)
        return;

    // We reduce the GLOBAL problem size by one.
    List.remove(Element); // This is a list, and Element is pointed to by an iterator, so O(1)

    // Some simple O(1) operations

    // Now we call the recursive function twice.
    Recursive(some other parameters 1);
    Recursive(some other parameters 2);
}

// This function performs a search with the given parameters
Element Solver::WhatCanISolve(some parameters)
{
    // Iterates through the whole List, so O(n) in List size
    // Returns the first element matching the parameters
    // Returns a single Element or null
}
My first thought was that it should be somewhere around O(n^2).
Then I thought of
T(n) = n + 2T(n-1)
which (according to WolframAlpha) expands to:
O(2^n)
However I think that the second idea is false, since n is reduced between recursive calls.
I also did some benchmarking with large sets. Here are the results:
N t(N) in ms
10000 480
20000 1884
30000 4500
40000 8870
50000 15000
60000 27000
70000 44000
80000 81285
90000 128000
100000 204380
150000 754390
Your algorithm is still O(2^n), even though it reduces the problem size by one item each time. Your difference equation
T(n) = n + 2T(n-1)
does not account for the removal of an item at each step. But it only removes one item, so the equation should be T(n) = n + 2T(n-1) - 1. Following your example and
saving the algebra by using WolframAlpha to solve this gives the solution T(n) = (c1 + 4)*2^(n-1) - n - 2, which is still O(2^n). Removing one item is not a considerable reduction given the other factors (especially the recursion).
A similar example that comes to mind is an n*n 2D matrix. Suppose you're only using it for a triangular matrix. Even though you remove one row to process for each column, iterating through every element still has complexity O(n^2), which is the same as if all elements were used (i.e. a square matrix).
For further evidence, I present a plot of your own collected running time data:
Presumably the time is quadratic. If WhatCanISolve returns nullptr iff the list is empty, then all calls to
Recursive(some other parameters 2);
will finish in O(1), because they are run with an empty list. This means the correct formula is actually
T(n) = C*n + T(n-1)
This means T(n) = O(n^2), which corresponds well to what we see on the plot.
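Expanding that recurrence shows the quadratic bound directly: T(n) = C*n + C*(n-1) + ... + C*1 = C*n(n+1)/2, which is O(n^2).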

Fast recursive function for returning n-th Fibonacci number

Can anybody explain how the following code works? The code was given as a fast recursive implementation of a function returning the n-th Fibonacci number. I have a general idea of how recursive functions work. I can fully understand the direct recursive implementation of such a function, using the definition of Fibonacci numbers, which, however, is not efficient.
The main thing I cannot grasp is what fib(n - 1, prev0) returns when we have garbage stored in prev0.
int fib(int n, int &prev1) {
    if (n < 2) {
        prev1 = 0;
        return n;
    }
    int prev0;
    prev1 = fib(n - 1, prev0);
    return prev0 + prev1;
}
I am a beginner, so, please, be as specific as you can.
You probably missed the fact that this function returns two results: One as its return value and one in the "input" parameter passed by reference.
The severe inefficiency of the simple recursive definition of fib is that at each recursive level you must make two different calls to lower levels, even though one of them includes all the work of the other.
By allowing the one that includes all the work of the "other" to also return the result of the "other", you avoid doubling the work at each level.
In the mathematical sense it is no longer a "function" (because of the side effect). But as a function in the programming sense, it sidesteps the efficiency problem of fib by returning two values from one call.
I think it appropriate to mention that in C++, there are more elegant ways to return a pair of values as the result of a function. (Even in C you could return a struct by value).
Edit (in response to your edit):
The main thing I cannot grasp is what fib(n - 1, prev0) returns
when we have garbage stored in prev0.
The trick is that prev0 is an output from the function, not an input.
I am a beginner, so, please, be as specific as you can.
The parameter declaration of int & in the function signature lets the function use that parameter as input or output or both as it chooses. This particular function uses it as output.
If you understand the basics of recursive functions, you understand how each level of the recursion has its own copy of the parameter n and of the local variable prev0. But prev1 is not a separate variable. It is effectively an alias for the higher level's prev0. So any read or write of the current level's prev1 really happens to the higher level's prev0.
This level's n is passed "by value" meaning it is a copy of the value of the passed expression (the higher level's n-1 ). But this level's prev1 is passed by reference so it is not a copy of the value of the higher level's prev0, it is an alias of the higher level's prev0.
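To make the aliasing concrete, here is a short trace of fib(3, p) for some caller variable p. Since 3 >= 2, the call declares its own local prev0 and evaluates fib(2, prev0). That call declares another local (call it prev0') and evaluates fib(1, prev0'), which hits the base case, sets prev0' = 0 and returns 1. Back in fib(2), its prev1 parameter (an alias for fib(3)'s prev0) receives that 1, and fib(2) returns 0 + 1 = 1. Back in fib(3), p receives that 1 and the result is prev0 + prev1 = 1 + 1 = 2, so the caller ends up with the pair (fib(3), fib(2)) = (2, 1).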
Note that prev1 is never read from. Only written to. Let's think about the function this way:
std::pair<int,int> fib_pair(int n) {
    if (n < 2) {
        return std::make_pair(n, 0);
    }
    std::pair<int, int> prev = fib_pair(n-1);
    return std::make_pair(prev.first + prev.second, prev.first);
}
Now it's clearer - there's the one recursive call, and fib(n) returns the two previous numbers, instead of just the one. As a result, we have a linear function instead of an exponential one. We can then rewrite the original version in terms of this one to help us understand both:
int fib(int n, int &prev1) {
    std::pair<int, int> pair = fib_pair(n);
    prev1 = pair.second;
    return pair.first;
}
Let's look at four different functions that compute the nth fibonacci number, using pseudocode instead of restricting the program to a single language. The first follows the standard recursive definition:
function fib(n)  # exponential
    if n <= 2 return 1
    return fib(n-1) + fib(n-2)
This function requires exponential time, O(2^n), because it recomputes previously-computed fibonacci numbers at each step. The second function requires linear time, O(n), by working from 1 to n instead of n to 1 and keeping track of the two previous fibonacci numbers:
function fib(n)  # linear
    if n <= 2 return 1
    prev2 = prev1 = 1
    k := 3
    while k <= n
        fib := prev2 + prev1
        prev2 := prev1
        prev1 := fib
        k := k + 1
    return fib
That's the same algorithm your program uses, though yours disguises what's going on by operating recursively and passing one of the parameters by reference to a variable in an outer scope.
Dijkstra described an algorithm for computing the nth fibonacci number in logarithmic time, O(log n), using matrices and the exponentiation-by-squaring algorithm. I won't give the full explanation here; Dijkstra does it better than I could (though you should beware his convention that F(0) = 1 instead of F(0) = 0 as we have been doing it). Here's the algorithm:
function fib(n)  # logarithmic
    if n <= 2 return 1
    n2 := n // 2  # integer division
    if n % 2 == 1 return square(fib(n2+1)) + square(fib(n2))
    return fib(n2) * (2*fib(n2-1) + fib(n2))
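For reference, the two branches rely on the standard Fibonacci identities F(2k+1) = F(k+1)^2 + F(k)^2 and F(2k) = F(k) * (2*F(k-1) + F(k)), applied with k = n2; halving n at each step is what keeps the recursion depth logarithmic.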
The fourth algorithm operates in constant time, O(1), provided you have floating-point numbers with sufficient precision, using the mathematical definition of the fibonacci numbers:
function fib(n)  # constant
    sqrt5 := sqrt(5)
    p := (1 + sqrt5) / 2
    q := 1 / p
    return floor((p**n + q**n) / sqrt5 + 0.5)
For most languages this last algorithm isn't very useful, because for fibonacci numbers of any size you need some kind of unlimited-precision decimal arithmetic library, and although it's constant time it will probably take longer in practice than the simple logarithmic-time algorithm operating on unlimited-precision integers, at least until n is very large.
The obvious (inefficient) implementation of finding a Fibonacci number would be:
int fib(int n) {
    if (n < 2) return n;
    return fib(n-2) + fib(n-1);
}
This implementation is inefficient: you are doing the same calculations twice.
For example, if n is 6, the algorithm tells you to add fib(4) and fib(5). To find fib(5), you need to add fib(4) and fib(3), and there you go, calculating fib(4) for the second time. As n becomes larger, this gets even more inefficient.
The example you provided avoids this inefficiency by remembering the previously computed Fibonacci numbers.
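To put numbers on that waste: in the naive recursion for fib(6), fib(4) is evaluated twice and fib(3) three times; in general fib(n-k) is re-evaluated F(k+1) times, so the total number of calls grows like the Fibonacci numbers themselves, i.e. exponentially. The reference-parameter version makes only one recursive call per level, so it performs just n calls.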