What is complexity of this pointer function? - c++

This is the code: a function receiving a pointer ptr (which points to a character array outside the function). The loop simply calculates the length. What will be the complexity function of this function? I have calculated it.
Is this correct complexity function calculation?
Inside the while, should I count 3 operations: *, ptr+c, and != ?
The asterisk * is dereferencing, ptr+c is calculating an address, and != is the condition check.
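From the description, the function is presumably something along these lines (a reconstruction, since the exact code is not shown in the post):

int length(const char *ptr) {
    int c = 0;
    while (*(ptr + c) != '\0')  // dereference, address calculation, comparison
        c++;
    return c;
}

With one dereference, one address calculation, and one comparison per iteration, the loop does a constant amount of work for each of the N characters.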

When we say O(N), we really mean O(N)+c, where c is a constant. This is because with very small N, the characteristics of the complexity may not show themselves, as nonessentials (noise) may dominate; this is what c represents. But as N grows, the constant becomes insignificant, as the term in N dominates the running time. Typically, a constant merely shifts the graph up or down by a small amount but doesn't change its shape. Thus, we omit it and just say O(N), which is what we're really interested in knowing.
The same thing happens with coefficients. If a complexity is proportional to N, it is a linear relationship. The graph of a linear function, multiplied by some coefficient, has a different slope, but its shape is still linear. Thus, the coefficient does not provide additional information about the complexity category.
Similarly, if you have something that is O(N)+O(N^2), the N^2 term dominates, so we ignore the smaller O(N) term and just call it an O(N^2) algorithm. It's not an exact science; it's a categorization to understand the nature of the complexity. Big-O notation is typically used to describe the worst-case complexity.
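For instance (a made-up function for illustration): take f(N) = 3N^2 + 50N + 7. At N = 1000, the 3N^2 term contributes 3,000,000 while 50N + 7 contributes only 50,007; as N keeps growing, the smaller term and the constant fade into irrelevance, so we simply say O(N^2).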
I think you might be interested in the coefficients and constants only when evaluating the relative merits of specific algorithms that have the same categorical complexity. (Some O(N lg N) sorts are faster than others, for example.) But usually that is measured a different way.

It depends strongly on the definition of "complexity" that you're using (this is certainly not one of the typical asymptotic notations), but if I properly remember these assignments from this level of education, then yes, you have counted correctly.
The resulting worst-case asymptotic algorithmic complexity in big-O notation would be O(N) (derived by eliding the coefficients and lower-order terms (sort of)).

Related

Is it possible for 1 to be O(n)

I'm learning O-notation, and I thought that 1 is O(1) because, since 1 is a constant, its Big-O would be 1. However, I'm reading that it can be O(n) as well. How is this possible? Is it because, if n = 1, then they would be the same?
Yes, a function that is O(1) is also O(n) -- and O(n^2), and O(e^n), and so on. (And mathematically, you can think of 1 as a function that always has the same value.)
If you look at the formal definition of Big-O notation, you'll see (roughly stated) that a function f(x) is O(g(x)) if g(x), scaled by some constant factor, eventually bounds f(x) from above as x goes to infinity. Quoting the linked article:
Let f and g be two functions defined on some subset of the real numbers. One writes

f(x) = O(g(x)) as x → ∞

if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that

|f(x)| ≤ M |g(x)| for all x ≥ x0.
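To see the definition in action (my own example, not part of the quote): take f(x) = 2x + 3 and g(x) = x. With M = 5 and x0 = 1, we have |2x + 3| ≤ 5|x| for all x ≥ 1, so 2x + 3 = O(x). The same choice of M and x0 also shows 2x + 3 = O(x^2), since x ≤ x^2 for x ≥ 1 - which is exactly why an O(1) or O(n) function is also O(n^2).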
However, we rarely say that an O(1) function or algorithm is O(n), since saying it's O(n) is misleading and doesn't convey as much information. For example, we say that the Quicksort algorithm is O(n log n), and Bubblesort is O(n^2). It's strictly true that Quicksort is also O(n^2), but there's no point in saying so -- largely because a lot of people aren't familiar with the exact mathematical meaning of Big-O notation.
There's another notation called Big Theta (Θ) which applies tighter bounds.
Actually, Big O Notation shows how the program complexity (it may be time, memory etc.) depends on the problem size.
O(1) means that the program complexity is independent of problem size. e.g. Accessing an array element. No matter which index you select, the time of accessing will be independent of the index.
O(n) means that the program complexity depends linearly on the problem size. e.g. If you are linearly searching for an element in an array, you may need to traverse most of the elements of the array. In the worst case, if the element is not present in the array, you will traverse the complete array.
If we increase the size of the array, the complexity (say, the time complexity) changes accordingly: it will take more time to traverse 100 elements than to traverse only 10 elements.
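As a small sketch of the two cases (illustrative code; the names are my own):

#include <cstddef>
#include <vector>

// O(1): accessing an element by index takes the same time
// no matter how large the array is.
int access(const std::vector<int>& v, std::size_t i) {
    return v[i];
}

// O(n): a linear search may have to look at every element; in the
// worst case (element absent) it traverses the whole array.
int linearSearch(const std::vector<int>& v, int target) {
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] == target)
            return static_cast<int>(i);
    return -1;  // not found after examining all n elements
}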
I hope this helps you a bit.
The Big O notation is a mathematical way to describe the behaviour of a function near a point or towards infinity. It is a tool used in computer science to analyse the complexity of an algorithm. The complexity of an algorithm helps you analyse whether your algorithm suits your situation and can process your logic in a reasonable time.
But that does not answer your question. The question of what happens when n equals 1 doesn't make sense in Big-O notation. Like the name says, it's a notation, not a way of calculating in mathematics. Big-O notation means evaluating the behaviour of the algorithm near infinity to tell which part of the algorithm is the most significant. For example, if an algorithm has behaviour that can be represented by the function 2x^2 + 3x, the Big-O approach says to take each part of this function, evaluate it near infinity, and keep the part that is the most significant. So, by evaluating 2x^2 and 3x near infinity, we see that 2x^2 grows much faster than 3x, and the gap between x^2 and x itself grows without bound. So, if we eliminate the coefficients (which are not the variable part of the function), we have two complexities, O(x^2) and O(x) - that is, O(n^2) and O(n) - and we know that the most significant is O(n^2).
It's the same thing if inside a part of code, you have two complexities O(1) and O(n) the O(n) will be your algorithm complexity.
But if an O(n) complexity process handles only one element, its behaviour will be equivalent to O(1). That doesn't mean your algorithm has O(1) complexity, though.

What is the time complexity of the following function?

#include <math.h>

int func(int n) {
    if (n == 1)
        return 0;
    else
        return sqrt(n);  // the double result is implicitly converted to int
}
Where sqrt(n) is a C math.h library function.
O(1)
O(lg n)
O(lg lg n)
O(n)
I think that the running time entirely depends on the sqrt(n). However, I don't know how this function is actually implemented.
P.S. The general approach towards finding the square root of a number that I know of is using Newton's method. If I am not wrong, the time complexity using Newton's method turns out to be O(lg n). So should the answer be O(lg n)?
P.P.S. Got this question in a recent test that I appeared for.
I am going to give a slightly more general answer, without assuming a constant-size int.
The answer is Theta(log n).
We know Newton-Raphson is Theta(log n) - that excludes Theta(n) (assuming sqrt() is as efficient as it can be).
However, a general number n requires log_2(n) bits to encode - and you need to read all of them in order to compute an accurate sqrt(). This excludes Theta(1) and Theta(log(log(n))).
From the above, we know that the complexity of the function is Theta(log(n)).
As a side note, since O(log(n)) is a subset of O(n), O(n) is also a valid answer, though not a tight one. For more information about big Theta and big O and their differences, you might want to have a look at this thread.
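As an illustration of the Newton-Raphson approach mentioned above, here is a minimal integer square-root sketch (my own illustration; math.h's sqrt is typically implemented quite differently, often in hardware):

#include <cstdio>

// Integer square root via Newton's iteration. Each step roughly
// doubles the number of correct bits, so the iteration count grows
// with the bit length of n rather than with n itself.
unsigned isqrt(unsigned n) {
    if (n < 2) return n;
    unsigned x = n;                 // initial guess
    unsigned y = (x + n / x) / 2;
    while (y < x) {                 // converges monotonically from above
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}

int main() {
    std::printf("%u\n", isqrt(1000000));  // prints 1000
}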
This depends on the implementation of sqrt and also on what kind of time complexity you are interested in.
I would say you can consider it to be "constant", so O(1), in this sense: if you put in a random int, it will on average take the same amount of time. (Reason: numbers with many digits are much more common.)
But have a look here. Another possible answer is O(M(n)), where M(n) is the complexity of a multiplication and n is the number of digits in your integer.
This looks like a textbook question and is perhaps meant to be a trap. The teacher perhaps wants to check whether you can distinguish between computing sqrt for a list of numbers (which would be O(n)) and for a single number (which would be O(1)).
Be aware that the "correct" answer often also depends on the context in which it is asked.
Let n = 2^m.
Given the recurrence T(n) = 2T(sqrt(n)) + 1:
T(2^m) = 2T(2^(m/2)) + 1
Let T(2^m) = S(m);
then,
S(m) = 2S(m/2) + 1
Using the master theorem (a = 2, b = 2, f(m) = Theta(1), so S(m) = Theta(m^(log_2 2))):
S(m) = Theta(m)
     = Theta(log(n))
Hence, the time complexity is Theta(log(n)).

Introsort (quicksort + heapsort) implementation and complexity

I've read that C++ uses introsort (introspective sort) for its built-in std::sort where it starts off with quicksort and switches to heapsort when you hit the depth limit.
I've also read that the depth limit is supposed to be 2*log(2,N).
Is this value purely experimental? Or is there some mathematical theory behind it?
If you have an interval (range or array), the number of times you'll have to split the interval in half before you end up with an empty (or one element) interval is log(2,N); that's just a mathematical fact, and you can work it out easily if you want. If all goes perfectly well with quicksort, it should recurse log(2,N) times, for the same reason (and at each recursion level, it has to process all values of the interval, which leads to O(N*log(2,N)) complexity for the overall algorithm). The problem is that quicksort could require many more recursions (if it keeps getting "unlucky" with picking pivot values, which means that it doesn't split the interval in half, but in an imbalanced way instead). At worst, quicksort could end up recursing N times, which is definitely not acceptable for a production-quality implementation.
Switching to heapsort at a depth of 2*log(2,N) is just a good general heuristic to detect an excessively deep recursion.
Technically, you could base this on the empirical performance of heapsort versus quicksort to figure out which limit is best. But such tests are highly dependent on the application (what are you sorting? how are you comparing elements? how cheap are the element swaps? etc.). So, most one-size-fits-all implementations, like std::sort, just pick a reasonable limit like 2*log(2,N).
What @Mikael Persson said regarding why the depth limit is 2*log(2,N) is partly correct. It is not just a good heuristic or a reasonable limit.
In fact, as you have probably guessed (as your second question hints), there is an important mathematical reason for this: in tilde notation (search for tilde notation), quicksort makes on average ~2*N*ln(N) comparisons. In big-O notation, this is equivalent to O(N*log(2,N)), and it corresponds to an average recursion depth proportional to log(2,N).
That is why introsort switches to heapsort (which has asymptotic O(N*log(2,N)) complexity) when the depth of the recursion exceeds 2*log(2,N). You can think of that as something which does not usually happen and most probably means that something went wrong with the pivot picking, and that quicksort alone would lead to O(N^2) complexity.
You can find a short mathematical proof of the average number of compares quicksort does here (slide 21).
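For concreteness, here is a sketch of the introsort control flow (my own illustration, not the actual std::sort source): partition quicksort-style until the 2*log(2,N) depth budget runs out, then fall back to heapsort for a guaranteed O(N*log(2,N)) finish.

#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

template <typename It>
void introsortImpl(It first, It last, int depthLimit) {
    if (last - first < 2) return;
    if (depthLimit == 0) {                // too many recursions: pivots went bad
        std::make_heap(first, last);
        std::sort_heap(first, last);      // heapsort: guaranteed O(N log N)
        return;
    }
    auto pivot = *(first + (last - first) / 2);
    // Three-way split: [first, mid1) < pivot, [mid1, mid2) == pivot, [mid2, last) > pivot
    It mid1 = std::partition(first, last, [&](const auto& v) { return v < pivot; });
    It mid2 = std::partition(mid1, last, [&](const auto& v) { return !(pivot < v); });
    introsortImpl(first, mid1, depthLimit - 1);
    introsortImpl(mid2, last, depthLimit - 1);
}

template <typename It>
void introsort(It first, It last) {
    const auto n = last - first;
    if (n > 1) introsortImpl(first, last, 2 * static_cast<int>(std::log2(n)));
}

int main() {
    std::vector<int> v{5, 3, 8, 1, 9, 2, 7};
    introsort(v.begin(), v.end());
    for (int x : v) std::cout << x << ' ';  // 1 2 3 5 7 8 9
}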

how does IF affect complexity?

Let's say we have an array of 1,000,000 elements and we go through all of them to check something simple, for example whether the first character is "A". From my (very little) understanding, the complexity will be O(n) and it will take some amount of time X. If I add another IF (not else if) to check, let's say, whether the last character is "G", how will it change the complexity? Will it double the complexity and time, like O(2n) and 2X?
I would like to avoid taking into consideration the number of calculations different commands have to make. For example, I understand that Len() requires more calculations to give us the result than a simple char comparison does, but let's say that the commands used in the IFs will have (almost) the same amount of complexity.
O(2n) = O(n). Generalizing, O(kn) = O(n), with k being a constant. Sure, with two IFs it might take twice the time, but execution time will still be a linear function of input size.
Edit: Here and here are explanations, with examples, of the big-O notation that are not too mathematically oriented.
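As a quick sketch of the scenario from the question (hypothetical names, assuming the elements are strings):

#include <string>
#include <vector>

// One pass, two constant-time checks per element: roughly 2n units
// of work, which is still O(n) - the growth stays linear in n.
int countMatches(const std::vector<std::string>& items) {
    int count = 0;
    for (const std::string& s : items) {
        if (!s.empty() && s.front() == 'A') ++count;  // first character
        if (!s.empty() && s.back() == 'G') ++count;   // last character
    }
    return count;
}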
Asymptotic complexity (which is what big-O uses) is not dependent on constant factors, more specifically, you can add / remove any constant factor to / from the function and it will remain equivalent (i.e. O(2n) = O(n)).
Assuming an if-statement takes a constant amount of time, it will only add a constant factor to the complexity.
A "constant amount of time" means:
The time taken for that if-statement for a given element is not dependent on how many other elements there are in the array
So, basically, it must not call a function that looks through the other elements of the array in some way, or do anything similar to that
Any non-function-calling if-statement is probably fine (unless it contains a statement that goes through the array, which some languages allow)
Thus 2 (constant-time) if-statements called for each element will be O(2n), but this is equal to O(n) (well, it might not really be 2n; more on that in the additional note).
See Wikipedia for more details and a more formal definition.
Note: Apart from not being dependent on constant factors, it is also not dependent on asymptotically smaller terms (terms which remain smaller regardless of how big n gets), e.g. O(n) = O(n + sqrt(n)). And big-O is just an upper bound, so saying it is O(n^9999) would also be correct (though saying that in a test / exam will probably get you 0 marks).
Additional note: The problem when not ignoring constant factors is - what counts as a unit of work? There is no standard definition here. One way is to use the operation that takes the longest, but determining this may not always be straightforward, nor would it always be particularly accurate, nor would you be able to generically compare the complexities of different algorithms.
Some key points about time complexity:
Theta notation - Exact bound. If the code we are analysing contains a conditional if/else, and either branch has code that grows with the input size, then an exact bound can't be obtained, since either branch might be taken; Theta notation is not advisable for such cases. On the other hand, if both branches resolve to constant-time code, then Theta notation can be applied.
Big O notation - Upper bound. If the code has conditionals where either branch might grow with the input size n, then we assume the maximum, or upper bound, when calculating the time consumed by the code. Hence we use Big O for such conditionals, assuming we take the path with the maximum time consumption. So the cheaper path can be treated as O(1) in amortized analysis (provided it contains no recursion that may grow with the input size), and we calculate the Big O time complexity for the lengthiest path.
Big Omega notation - Lower bound. This is the minimum guaranteed time that a piece of code can take, irrespective of the input. It is useful for cases where the time taken by the code doesn't grow with the input size n but still consumes a significant amount of time k. In these cases, we can use lower-bound analysis.
Note: None of these notations depends upon the input being best/avg/worst, and all of them can be applied to any piece of code.
So, as discussed above, Big O doesn't care about constant factors such as k and only sees how time increases with respect to growth in n, which in this case is O(kn) = O(n): linear.
PS: This post was about the relation between Big O and the evaluation of conditionals for amortized analysis.
It's related to a question I posted myself today.
In your example it depends on whether you can jump from the first to the last element, and if you can't, then it also depends on the average length of each entry.
If, as you went down through the array, you had to read each full entry in order to evaluate your two if statements, then your order would be O(1,000,000 x N), where N is the average length of each entry. If N is variable, then it will affect the order. An example would be standard multiplication, where we perform Log(N) additions of an entry which is Log(N) in length, and so the order is O(Log^2(N)), or if you prefer, O((Log(N))^2).
On the other hand, if you can just check the first and last characters, then N = 2 is constant and can be ignored.
This is an IMPORTANT point, though - you have to be careful about deciding whether your multiplier can be ignored. For example, say we were doing Log(N) additions of a Log(N/100) number. Just because Log(N/100) is the smaller term doesn't mean we can ignore it. The multiplying factor cannot be ignored if it is variable.

Performance of doing bitwise operations on bitsets

In C++ if I do a logical OR (or AND) on two bitsets, for example:
#include <bitset>

std::bitset<1000000> b1, b2;
// some stuff
b1 |= b2;
Does this happen in O(n) or O(1) time? Why?
Also, can this be accomplished using an array of bools in O(1) time?
Thanks.
It has to happen in O(N) time since there is a finite number of bits that can be processed in any given chunk of time by a given processor platform. In other words, the larger the bit-set, the longer the amount of time each operation will take, and the increase will be linear with respect to the number of bits in the bitset.
You also end up with the same problem using the array of bool types. While each individual operation itself will take O(1) time, the total amount of time for N objects will be O(N).
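A rough sketch of why (an illustration of the idea, not the actual library source): the OR has to touch every underlying machine word, so the work grows linearly with the number of bits.

#include <cstdint>
#include <vector>

// OR-ing two large bit-sets stored as 64-bit words: N bits take
// about N/64 word operations. The 64 is only a constant factor,
// so the operation is O(N) in the number of bits.
void orInto(std::vector<std::uint64_t>& a, const std::vector<std::uint64_t>& b) {
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
        a[i] |= b[i];
}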
It's impossible to perform a logical operation (e.g. OR or AND) on arbitrary arrays of flags in unit time. True Big-O analysis deals with runtime as the size of the data tends to infinity, and a Core i7 is never going to OR together a billion bits in the same time it takes to OR together two bits.
I think it needs to be made clear that Big O is a boundary - an asymptotic upper boundary (the running time cannot grow faster than the function's Big O) - and that it states the order of magnitude of the speed of a computation. So think about how an array works: if you can say "I can do this operation in one computation or so", or in some known amount that is very small and much less than N, then it is constant. If you need to iterate in some manner, it isn't - and in this case all the bits need to be checked, and there is no shortcut for bitwise OR; therefore N bits need to be computed, and it's O(n). [It's actually a tighter boundary than that, but here we're dealing with just Big O.] An array itself stores N bits in it.
In fact, few things are really O(1): index lookups at a known address using a pointer can be O(1), if you already know what you are looking up. But if you have M things that need to be looked up in constant time, then it is O(M) * O(1) = O(M).
This is a function of the modern-day computer, since most things are processed sequentially. (Multi-core helps, but doesn't come close to affecting Big-O notation yet.) There is, of course, the ability of the computer to process words in parallel, but even that is just a constant factor: O(n) / O(64) is still O(n).