Quadratic Sieve - What does o(1) stand for? - c++

I'm trying to implement the Quadratic Sieve, and I noticed I need to choose a smoothness bound B to use this algorithm. I found on the web that B can be taken as exp((1/2 + o(1))(log n log log n)^(1/2)), but now my problem is o(1). Could you tell me what o(1) stands for?

Let's start with the answer to your question:
The definition of f(n) being o(1) is that lim_{n→∞} f(n) = 0. That means that for all ϵ > 0 there exists N_ϵ, depending on ϵ, such that for all n ≥ N_ϵ we have |f(n)| ≤ ϵ.
Or in plain English:
The notation o(1) means "a function that converges to 0."
This is a fantastic resource: http://bigocheatsheet.com
Look at the Notation for asymptotic growth section
The answer can also be found in this duplicate post: Difference between Big-O and Little-O Notation
f ∈ O(g) says, essentially
For at least one choice of a constant k > 0, you can find a constant a such that the inequality f(x) < k g(x) holds for all x > a.
Note that O(g) is the set of all functions for which this condition holds.
f ∈ o(g) says, essentially
For every choice of a constant k > 0, you can find a constant a such that the inequality f(x) < k g(x) holds for all x > a.

O(1) means the function is bounded by a constant; for a running time, that is constant time, unaffected by input size.
o(1) (slightly different!) means the function it represents converges to 0.
I wouldn't worry too much about the smoothness bound. Write the rest of the much more complicated algorithm first, using a very simple smoothness rule (the first 100,000 primes, or the first n primes where n = c * log(number)). Once the rest of the algorithm is working (and perhaps optimized?), choosing the smoothness bound carefully will actually have a significant effect. The long formula you gave in the question looks very much like the approximate (asymptotic) running time of the quadratic sieve itself, which is exp((1 + o(1))(log n log log n)^(1/2)); the version with 1/2 in the exponent is the asymptotically optimal choice of B. Either way it is an asymptotic statement, so it does not hand you an exact value of B for a particular n.
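As a rough starting point, the formula from the question can be evaluated with the o(1) term simply set to 0. The sketch below does that; the function names are made up for illustration, and 1 / ln(n) is only one example of a function that is o(1), not the actual term hidden in the formula. It assumes n is at least 3 so that log log n is positive.

```python
import math

def heuristic_smoothness_bound(n: int) -> int:
    """Formula from the question with o(1) replaced by 0 (a heuristic, not a tuned value)."""
    ln_n = math.log(n)
    return math.ceil(math.exp(0.5 * math.sqrt(ln_n * math.log(ln_n))))

def o1_example(n: int) -> float:
    """One possible o(1) function, chosen only for illustration: it tends to 0 as n grows."""
    return 1.0 / math.log(n)

for digits in (20, 40, 60):
    n = 10 ** digits
    print(digits, heuristic_smoothness_bound(n), round(o1_example(n), 4))
```

In practice the value is usually tuned experimentally around such an estimate, since the o(1) term says nothing about any particular n.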

Related

Is it possible for 1 to be O(n)

I'm learning O-notation, and I thought that 1 is O(1) because, since 1 is considered a constant, its Big-O would be 1. However, I'm reading that it can be O(n) as well. How is this possible? Would it be because if n = 1 then it would be the same?
Yes, a function that is O(1) is also O(n) -- and O(n^2), and O(e^n), and so on. (And mathematically, you can think of 1 as a function that always has the same value.)
If you look at the formal definition of Big-O notation, you'll see (roughly stated) that a function f(x) is O(g(x)) if f(x) is at most some constant multiple of g(x) as x goes to infinity. Quoting the linked article:
Let f and g be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x → ∞
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| ≤ M |g(x)| for all x ≥ x0.
However, we rarely say that an O(1) function or algorithm is O(n), since saying it's O(n) is misleading and doesn't convey as much information. For example, we say that the Quicksort algorithm is O(n log n) on average, and Bubblesort is O(n^2). It's strictly true that Quicksort is also O(n^2), but there's no point in saying so -- largely because a lot of people aren't familiar with the exact mathematical meaning of Big-O notation.
There's another notation called Big Theta (Θ) which applies tighter bounds.
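To make the quoted definition concrete, here is a small sketch that tests the inequality |f(x)| ≤ M |g(x)| on a finite sample. The helper name and the sample range are assumptions for illustration, and a finite check is only evidence, not a proof.

```python
def is_bounded_on_sample(f, g, M, x0, xs):
    """Check |f(x)| <= M * |g(x)| for every sampled x >= x0."""
    return all(abs(f(x)) <= M * abs(g(x)) for x in xs if x >= x0)

xs = range(1, 10_001)
one = lambda x: 1       # the constant function 1
linear = lambda x: x    # the function n

# 1 is O(n): M = 1 and x0 = 1 work as witnesses.
print(is_bounded_on_sample(one, linear, M=1, x0=1, xs=xs))
# n is not O(1): even M = 1000 fails once x passes 1000.
print(is_bounded_on_sample(linear, one, M=1000, x0=1, xs=xs))
```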
Actually, Big O Notation shows how the program complexity (it may be time, memory etc.) depends on the problem size.
O(1) means that the program complexity is independent of problem size. e.g. Accessing an array element. No matter which index you select, the time of accessing will be independent of the index.
O(n) means that the program complexity linearly depends on problem size. e.g. If you are linear searching an element in the array, you need to traverse most of the elements of the array. In the worst case, if element is not present in the array, you will be traversing the complete array.
If we increase the size of the array, the running time changes with it: traversing 100 elements takes more time than traversing only 10 elements.
I hope this helps you a bit.
Big O notation is a mathematical way to describe the behaviour of a function near a point or towards infinity. It is used in computer science to analyse the complexity of an algorithm, which in turn helps you judge whether the algorithm suits your situation and processes your input in a reasonable time.
But that does not answer your question. Asking what happens when n is equal to 1 doesn't make sense in Big O notation. As the name says, it is a notation, not a way to calculate a value. Big O notation describes the behaviour of an algorithm as n tends to infinity, to tell which part of the algorithm is the most significant. For example, if an algorithm's behaviour is represented by the function 2x^2 + 3x, you look at each term as x tends to infinity and keep the one that grows fastest. Both 2x^2 and 3x tend to infinity, but 2x^2 grows faster, and the gap between them grows without bound too. Dropping the constant coefficients leaves two complexities, O(x^2) and O(x), that is O(n^2) and O(n), and the most significant one is O(n^2).
It's the same if, inside a piece of code, you have two parts with complexities O(1) and O(n): the O(n) part determines your algorithm's complexity.
And if an O(n) algorithm happens to process only one element, it will behave like an O(1) one on that input, but that doesn't mean that your algorithm has O(1) complexity.

Linear algorithm to find minimum subset sum over a threshold

I have a collection of N positive integers, each bounded by a (relatively small) constant C. I want to find a subset of these numbers with the smallest sum greater than (or equal to) a value K.
The numbers involved aren't terribly large (<100), but I need good performance even in the worst case. I thought maybe I could adapt Pisinger's dynamic programming algorithm to the task; it runs in O(NC) time, and I happen to meet the requirements of bounded, positive numbers.
[Edit]: The numbers are not sorted and there may be duplicates.
However, I don't understand the algorithm well enough to do this myself. In fact, I'm not even certain if the assumptions it is based on still hold...
-Is it possible to adapt this algorithm to my needs?
-Or is there another linear algorithm I could use that is similarly efficient?
-Could anyone provide pseudocode or a detailed explanation?
Thanks.
Link to the Subset-Sum code I was investigating:
Fast solution to Subset sum algorithm by Pisinger
(Apologies if this is poorly worded/formatted/etc. I'm still new to StackOverflow...)
Pisinger's algorithm gives you the largest sum less than or equal to the capacity of the knapsack. To solve your problem, use Pisinger to figure out what not to put in the subset. Formally, let the items be w_1, ..., w_n and the minimum be K. Give w_1, ..., w_n and w_1 + ... + w_n - K to Pisinger, then take every item that Pisinger does not.
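The complement trick can be sketched as follows. This is not Pisinger's algorithm: a plain O(N * capacity) subset-sum table stands in for it, and both function names are hypothetical. Any routine that returns a largest-sum subset within a capacity could be swapped in.

```python
def max_subset_at_most(weights, capacity):
    """Stand-in for Pisinger: indices of a subset with the largest sum <= capacity."""
    # parent[s] = (previous sum, index of the item added) for each reachable sum s
    parent = {0: None}
    for i, w in enumerate(weights):
        for s in list(parent):          # snapshot, so each item is used at most once
            t = s + w
            if t <= capacity and t not in parent:
                parent[t] = (s, i)
    chosen = []
    s = max(parent)                     # best reachable sum within the capacity
    while parent[s] is not None:        # walk back to recover the items
        s, i = parent[s]
        chosen.append(i)
    return chosen

def min_subset_at_least(weights, K):
    """Subset with the smallest sum >= K, or None if even the total is below K."""
    total = sum(weights)
    if total < K:
        return None
    left_out = set(max_subset_at_most(weights, total - K))
    return [w for i, w in enumerate(weights) if i not in left_out]

print(min_subset_at_least([5, 7, 3, 9], 11))
```

Maximising what is left out (at most total - K) is the same as minimising what is kept (at least K), which is why the complement is the answer.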
Well one solution is to:
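def min_sum_at_least(V, K):
    T = {0}                          # every subset sum reachable so far
    for x in V:
        T |= {x + t for t in T}      # new sums are built from a snapshot of T
    for i in range(K, max(T) + 1):
        if i in T:
            return i
    return None                      # fail: even the total sum is below K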
This gives you the sum of the subset, but you can adapt it to output the members.
T only holds sums between 0 and NC, so its maximum size is O(N) when C is treated as a constant. The running time is therefore O(N^2) and the space is O(N). You can use a bit array of length NC + 1 as the backing store of T.
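The bit-array idea can be sketched with a Python integer acting as the bit array, where bit s is set when some subset sums to s. The function name is hypothetical and K is assumed to be non-negative.

```python
def min_sum_at_least_bits(values, K):
    """Smallest subset sum >= K, using an int as a bit array of reachable sums."""
    reachable = 1                        # only bit 0 set: the empty subset sums to 0
    for x in values:
        reachable |= reachable << x      # add x to every sum found so far
    candidates = reachable >> K          # discard the sums below K
    if candidates == 0:
        return None                      # no subset reaches K
    lowest = candidates & -candidates    # isolate the lowest set bit
    return K + lowest.bit_length() - 1

print(min_sum_at_least_bits([5, 7, 3, 9], 11))
```

Each shift-and-or touches the whole bit array, so the work per element is proportional to NC bits, but it is done a machine word at a time.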

how does IF affect complexity?

Let's say we have an array of 1.000.000 elements and we go through all of them to check something simple, for example if the first character is "A". From my (very little) understanding, the complexity will be O(n) and it will take some X amount of time. If I add another IF (not else if) to check, let's say, if the last character is "G", how will it change complexity? Will it double the complexity and time? Like O(2n) and 2X?
I would like to avoid taking into consideration the number of calculations different commands have to make. For example, I understand that Len() requires more calculations to give us the result than a simple char comparison does, but let's say that the commands used in the IFs will have (almost) the same amount of complexity.
O(2n) = O(n). Generalizing, O(kn) = O(n), with k being a constant. Sure, with two IFs it might take twice the time, but execution time will still be a linear function of input size.
Edit: Here and Here are explanations, with examples, of big-O notation that are not too mathematically oriented.
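One way to see that O(2n) = O(n) is to count the checks directly. The sketch below uses made-up helper names and treats every check as one unit of work, which is itself an assumption.

```python
def count_checks(words, conditions):
    """Return (matches, comparisons) when every condition is tested on every word."""
    matches = 0
    comparisons = 0
    for word in words:
        for condition in conditions:
            comparisons += 1
            if condition(word):
                matches += 1
    return matches, comparisons

starts_with_a = lambda w: w[:1] == "A"
ends_with_g = lambda w: w[-1:] == "G"

for n in (1_000, 2_000, 4_000):
    words = ["ACG"] * n
    _, one_if = count_checks(words, [starts_with_a])
    _, two_ifs = count_checks(words, [starts_with_a, ends_with_g])
    print(n, one_if, two_ifs, two_ifs / one_if)
```

The ratio in the last column stays at 2 however large n gets, so both versions grow linearly; only the constant factor differs.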
Asymptotic complexity (which is what big-O uses) is not dependent on constant factors, more specifically, you can add / remove any constant factor to / from the function and it will remain equivalent (i.e. O(2n) = O(n)).
Assuming an if-statement takes a constant amount of time, it will only add a constant factor to the complexity.
A "constant amount of time" means:
The time taken for that if-statement for a given element is not dependent on how many other elements there are in the array
So basically it is constant time as long as it doesn't call a function that looks through the other elements of the array, or do something similar.
Any non-function-calling if-statement is probably fine (unless it contains a statement that goes through the array, which some languages allow).
Thus 2 (constant-time) if-statements executed for each element will be O(2n), but this is equal to O(n) (well, it might not really be 2n, more on that in the additional note).
See Wikipedia for more details and a more formal definition.
Note: Apart from not being dependent on constant factors, it is also not dependent on asymptotically smaller terms (terms which remain smaller regardless of how big n gets), e.g. O(n) = O(n + sqrt(n)). And big-O is just an upper bound, so saying it is O(n^9999) would also be correct (though saying that in a test / exam will probably get you 0 marks).
Additional note: The problem when not ignoring constant factors is - what classifies as a unit of work? There is no standard definition here. One way is to use the operation that takes the longest, but determining this may not always be straight-forward, nor would it always be particularly accurate, nor would you be able to generically compare complexities of different algorithms.
Some key points about time complexity:
Theta notation - Exact bound, hence if a piece of code which we are analyzing contains conditional if/else and either part has some more code which grows based on input size then exact bound can't be obtained since either of branch might be taken and Theta notation is not advisable for such cases. On the other hand, if both of the branches resolve to constant time code, then Theta notation can be applicable in such case.
Big O notation - Upper bound. If the code has conditionals where either branch might grow with input size n, we assume the maximum, i.e. the upper bound, when calculating the time consumed by the code; hence we use Big O for such conditionals, assuming we take the path with the maximum time consumption. The path with the lower time can then be treated as O(1) in amortized analysis (provided this path has no recursion that may grow with the input size), and the Big O time complexity is calculated for the lengthiest path.
Big Omega notation - Lower bound, This is the minimum guaranteed time that a piece of code can take irrespective of the input. Useful for cases where the time taken by code doesn't grow based on input size n, but it consumes a significant amount of time k. In these cases, we can use the lower bound analysis.
Note: None of these notations depends on the input being best/avg/worst; all of them can be applied to any piece of code.
So as discussed above, Big O doesn't care about the constant factors such as k and only sees how time increases with respect to growth in n, in which case here it is O(kn) = O(n) linear.
PS: This post was about the relation of big O and conditionals evaluation criteria for amortized analysis.
It's related to a question I posted myself today.
In your example it depends on whether you can jump from the first to the last element and if you can't then it also depends on the average length of each entry.
If, as you went down through the array, you had to read each full entry in order to evaluate your two if statements, then your order would be O(1,000,000 × N), where N is the average length of each entry. If N is variable then it will affect the order. An example would be standard multiplication, where we perform Log(N) additions of an entry which is Log(N) in length, and so the order is O(Log^2(N)) or, if you prefer, O((Log(N))^2).
On the other hand if you can just check the first and last character then N = 2 and is constant so can be ignored.
This is an IMPORTANT point, and you have to be careful here: how do you decide whether your multiplier can be ignored? For example, say we were doing Log(N) additions of a Log(N/100) number. Just because Log(N/100) is the smaller term doesn't mean we can ignore it. The multiplying factor cannot be ignored if it is variable.

Missing number(s) Interview Question Redux

The common interview problem of determining the missing value in a range from 1 to N has been done a thousand times over. Variations include 2 missing values up to K missing values.
Example problem: Range [1,10] (1 2 4 5 7 8 9 10) = {3,6}
Here is an example of the various solutions:
Easy interview question got harder: given numbers 1..100, find the missing number(s)
My question is that seeing as the simple case of one missing value is of O(n) complexity and that the complexity of the larger cases converge at roughly something larger than O(nlogn):
Couldn't it just be easier to answer the question by saying sort (mergesort) the range and iterate over it observing the missing elements?
This solution should take no more than O(nlogn) and is capable of solving the problem for ranges other than 1-to-N such as 10-to-1000 or -100 to +100 etc...
Is there any reason to believe that the given solutions in the above SO link will be better than the sorting based solution for larger number of missing values?
Note: It seems a lot of the common solutions to this problem, assume an only number theoretic approach. If one is being asked such a question in an S/E interview wouldn't it be prudent to use a more computer science/algorithmic approach, assuming the approach is on par with the number theoretic solution's complexity...
More related links:
https://mathoverflow.net/questions/25374/duplicate-detection-problem
How to tell if an array is a permutation in O(n)?
You are only specifying the time complexity, but the space complexity is also important to consider.
The problem complexity can be specified in term of N (the length of the range) and K (the number of missing elements).
In the question you link, the solution of using equations is O(K) in space (or perhaps a bit more ?), as you need one equation per unknown value.
There is also the preservation point: may you alter the list of known elements ? In a number of cases this is undesirable, in which case any solution involving reordering the elements, or consuming them, must first make a copy, O(N-K) in space.
I cannot see faster than a linear solution: you need to read all known elements (N-K) and output all unknown elements (K). Therefore you cannot get better than O(N) in time.
Let us break down the solutions
Destroying, O(N) space, O(N log N) time: in-place sort
Preserving, O(K) space ?, O(N log N) time: equation system
Preserving, O(N) space, O(N) time: counting sort
Personally, though I find the equation system solution clever, I would probably use either of the sorting solutions. Let's face it: they are much simpler to code, especially the counting sort one!
And as far as time goes, in a real execution, I think the "counting sort" would beat all other solutions hands down.
Note: the counting sort does not require the range to be [0, X), any range will do, as any finite range can be transposed to the [0, X) form by a simple translation.
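A minimal sketch of the counting-sort idea with the range translation described in the note. The function name is hypothetical, and every input value is assumed to lie inside [lo, hi].

```python
def missing_in_range(numbers, lo, hi):
    """Missing values of [lo, hi] via a presence array: O(N) time, O(N) space."""
    seen = [False] * (hi - lo + 1)
    for x in numbers:
        seen[x - lo] = True              # translate [lo, hi] to [0, hi - lo]
    return [lo + i for i, present in enumerate(seen) if not present]

print(missing_in_range([1, 2, 4, 5, 7, 8, 9, 10], 1, 10))
print(missing_in_range([-2, 0, 1], -3, 1))
```

The input list is only read, never reordered, so this matches the "preserving" variant above.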
EDIT:
Changed the sort to O(N), one needs to have all the elements available to sort them.
Having had some time to think about the problem, I also have another solution to propose. As noted, when N grows (dramatically) the space required might explode. However, if K is small, then we could change our representation of the list, using intervals:
{4, 5, 3, 1, 7}
can be represented as
[1,1] U [3,5] U [7,7]
In the average case, maintaining a sorted list of intervals is much less costly than maintaining a sorted list of elements, and it's as easy to deduce the missing numbers too.
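The time complexity is easy: O(N log N) if the intervals are kept in a balanced search tree, since each element then costs one O(log N) lookup and update; after all it's basically a tree-based insertion sort.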
Of course what's really interesting is that there is no need to actually store the list, thus you can feed it with a stream to the algorithm.
On the other hand, I have quite a hard time figuring out the average space complexity. The "final" space occupied is O(K) (at most K+1 intervals), but during the construction there will be much more missing intervals as we introduce the elements in no particular order.
The worst case is easy enough: N/2 intervals (think odd vs even numbers). I cannot, however, figure out the average case. My gut feeling is telling me it should be better than O(N), but I am not that trusting.
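The interval representation can be sketched like this, consuming the numbers one at a time as a stream would deliver them. The function names are made up, and a sorted Python list found by binary search stands in for a balanced tree, so an insertion in the middle costs time proportional to the number of intervals rather than O(log N). All inputs are assumed to lie inside [lo, hi].

```python
import bisect

def add_number(intervals, x):
    """Insert x into a sorted list of disjoint, non-adjacent [start, end] intervals."""
    i = bisect.bisect_left(intervals, [x, x])
    if i > 0 and intervals[i - 1][1] >= x:
        return                                   # x already covered on the left
    if i < len(intervals) and intervals[i][0] == x:
        return                                   # x already starts an interval
    touches_left = i > 0 and intervals[i - 1][1] == x - 1
    touches_right = i < len(intervals) and intervals[i][0] == x + 1
    if touches_left and touches_right:           # x bridges two intervals
        intervals[i - 1][1] = intervals[i][1]
        del intervals[i]
    elif touches_left:
        intervals[i - 1][1] = x
    elif touches_right:
        intervals[i][0] = x
    else:
        intervals.insert(i, [x, x])

def missing_from_intervals(intervals, lo, hi):
    """Numbers of [lo, hi] that fall in the gaps between the intervals."""
    missing = []
    expected = lo
    for start, end in intervals:
        missing.extend(range(expected, start))
        expected = end + 1
    missing.extend(range(expected, hi + 1))
    return missing

intervals = []
for x in [4, 5, 3, 1, 7]:
    add_number(intervals, x)
print(intervals)
print(missing_from_intervals(intervals, 1, 7))
```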
Whether the given solution is theoretically better than the sorting one depends on N and K. While your solution has complexity of O(N*log(N)), the given solution is O(N*K). I think that the given solution is (same as the sorting solution) able to solve any range [A, B] just by transforming the range [A, B] to [1, N].
What about this?
create your own set containing all the numbers
remove the given set of numbers from your set (no need to sort)
What's left in your set are the missing numbers.
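A short sketch of this set-based approach, assuming the range [lo, hi] is known in advance. The function name is hypothetical, and the final sort is only there to make the output order predictable.

```python
def missing_by_set_difference(numbers, lo, hi):
    """Build the full set for [lo, hi], remove what was given, return the rest."""
    remaining = set(range(lo, hi + 1))
    remaining.difference_update(numbers)   # no need to sort the input
    return sorted(remaining)

print(missing_by_set_difference([1, 2, 4, 5, 7, 8, 9, 10], 1, 10))
```

With a hash set this is expected O(N) time and O(N) space, much like the counting approach.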
My question is that seeing as the [...] cases converge at roughly something larger than O(nlogn) [...]
In 2011 (after you posted this question) Caf posted a simple answer that solves the problem in O(n) time and O(k) space [where the array size is n - k].
Importantly, unlike in other solutions, Caf's answer has no hidden memory requirements (using bit arrays, adding numbers to elements, multiplying elements by -1 - these would all require O(log(n)) space).
Note: The question here (and the original question) didn't ask about the streaming version of the problem, and the answer here doesn't handle that case.
Regarding the other answers: I agree that many of the proposed "solutions" to this problem have dubious complexity claims, and if their time complexities aren't better in some way than either:
count sort (O(n) time and space)
compare (heap) sort (O(n*log(n)) time, O(1) space)
...then you may as well just solve the problem by sorting.
However, we can get better complexities (and more importantly, genuinely faster solutions):
Because the numbers are taken from a small, finite range, they can be 'sorted' in linear time.
All we do is initialize an array of 100 booleans, and for each input, set the boolean corresponding to each number in the input, and then step through reporting the unset booleans.
If there are N elements in total, where each number x satisfies 1 <= x <= N, then we can solve this in O(n log n) time and, with an in-place sort such as heapsort, O(1) extra space.
First sort the array (quicksort and mergesort also work, but need O(log n) and O(n) extra space respectively).
Scan through the sorted array; if the difference between the previously scanned number a and the current number b is equal to 2 (b - a = 2), then the missing number is a + 1. This extends to the case b - a > 2, where every number strictly between a and b is missing.
The time complexity is O(n log n) + O(n), which is O(n log n).
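The sort-and-scan steps can be sketched as below. The function name is hypothetical; `sorted` makes a copy, so this version uses O(n) extra space rather than sorting in place, and sentinels at both ends catch values missing at the edges of the range.

```python
def missing_by_sorting(numbers, lo, hi):
    """Sort, then report every gap between consecutive values of [lo, hi]."""
    missing = []
    previous = lo - 1                              # sentinel just below the range
    for current in sorted(numbers) + [hi + 1]:     # sentinel just above the range
        missing.extend(range(previous + 1, current))
        previous = current
    return missing

print(missing_by_sorting([1, 2, 4, 5, 7, 8, 9, 10], 1, 10))
```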
I already answered it HERE
You can also create an array of boolean of the size last_element_in_the_existing_array + 1.
In a for loop mark all the element true that are present in the existing array.
In another for loop print the index of the elements which contains false AKA The missing ones.
Time Complexity: O(last_element_in_the_existing_array)
Space Complexity: O(last_element_in_the_existing_array), since that is the size of the boolean array
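If the range is known ahead of time (in this case [1,10]), you can XOR every number in the range with every number given to you. Since XOR is commutative and associative and x XOR x = 0, the values that are present cancel out. With a single missing number you are left with that number; with two missing numbers you are left with their XOR, which then has to be split by other means (for example by partitioning on a bit where the two differ).
(1 2 3 4 5 6 7 8 9 10) XOR (1 2 4 5 7 8 9 10) = 3 XOR 6 = 5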
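A quick way to check what the XOR actually yields is sketched below, with a hypothetical function name. With one missing value the result is that value; with two, the result is their XOR (here 3 XOR 6 = 5), not the pair itself.

```python
from functools import reduce
from operator import xor

def xor_of_missing(numbers, lo, hi):
    """XOR of all values in [lo, hi] that are absent from numbers."""
    return reduce(xor, range(lo, hi + 1), 0) ^ reduce(xor, numbers, 0)

print(xor_of_missing([1, 2, 4, 5, 6, 7, 8, 9, 10], 1, 10))   # one value missing
print(xor_of_missing([1, 2, 4, 5, 7, 8, 9, 10], 1, 10))      # two values missing
```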

complexity about going from beginning to end and back through a vector

I am trying to become familiar with the complexity evaluation of algorithms. In general I think it is a good/elegant practice, but specifically I need it to express the time complexity of my C++ code.
I have a small doubt. Suppose I have an algorithm that just reads data from the beginning of a std::vector until the end; then it does the same starting from the end back to the beginning (so there are 2 loops over the indexes, "From 0 To N" followed by "From N To 0").
I said to myself that the complexity for this stuff is O(2N): is this correct?
Once I reached the beginning, suppose that I want to start reading again all data from beginning to the end (passing in total 3 times the vector): is the complexity O(3N)?
It is maybe a stupid doubt, but I would like to have someone's opinion anyway about my thinking process.
Big-O notation simply means:
f(n) = O( g(n) ) if and only if f(n) / g(n) does not grow to infinity as n increases
What you have to do is count the number of operations you're performing, which is f(n), and then find a function g(n) that increases at least as fast as f.
In your example of going one way and then back, the number of operations is f(n) = 2n because each element is read twice, so, you can choose g(n) = n. Since f(n) / g(n) = 2n / n = 2 obviously does not grow to infinity (it's a constant), you have an O(n) algorithm.
It's also an O(2n) algorithm, of course : since the "grow to infinity" property does not change when you multiply g(n) by a constant, any O( g(n) ) is also by definition an O( C g(n) ) algorithm for any constant C.
And it's also an O(n²) algorithm, because 2n / n² = 2 / n decreases towards zero. Big-O notation only provides an upper bound on the complexity.
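The ratio test above can be tried numerically. This sketch counts element reads for alternating forward and backward passes, with a Python list standing in for the std::vector and a made-up function name, and then prints f(n) / g(n) for g(n) = n and g(n) = n².

```python
def count_reads(data, passes):
    """Alternate forward and backward passes over data, counting element reads."""
    reads = 0
    for p in range(passes):
        sequence = data if p % 2 == 0 else reversed(data)
        for _ in sequence:
            reads += 1
    return reads

for n in (10, 1_000, 100_000):
    data = list(range(n))
    two = count_reads(data, 2)
    three = count_reads(data, 3)
    print(n, two / n, three / n, two / n ** 2)
```

The first two ratios stay fixed at 2 and 3 as n grows, while the last one shrinks towards zero, which is the behaviour the definition asks for in each case.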
O(N), O(2N) and O(3N) are equivalent. Multiplying the function inside the O( ) by a constant factor won't change its complexity class, which stays "linear".
It is true, however, that each scan will perform N reads in either direction, i.e. it will perform 2N ∈ O(N) reads when scanning from start to end to start, and 3N ∈ O(N) reads when scanning from start to end to start to end.
It's important to get a working feel for Big-O notation. I'll try to convey that...
As you say, your algorithm intuitively is "O(2N)", but imagine someone else writes an algorithm that iterates only once (therefore clearly O(N)) but spends twice as long processing each node, or a hundred times as long. You can see that O(2N) is only very weakly suggestive of something slower than an O(N) algorithm: not knowing what the operations are, O(N) might only be faster say 50.1% of the time.
Big-O becomes meaningful only as N gets huge: if your operations vary in cost by say 1000:1, then the difference between an O(N) and O(NlogN) algorithm only becomes dominant once log N exceeds 1000, which with base-2 logarithms means N above 2^1000. So, Big-O notation is for reasoning about the cost of operations on large sets, in which constant factors like 2x or 10x just aren't considered relevant, and they're ignored.