Algorithm on List and Maximum Product - c++

a) Given a sequence X = (x1, x2, ..., xn) of positive real numbers, we can find a contiguous subsequence whose elements have the maximum product in O(n).
b) With an O(n) algorithm we can merge m = sqrt(n) sorted sequences that together contain n elements.
Why does my professor say these two statements are false?
I read about an O(n) algorithm for (a):
http://www.geeksforgeeks.org/maximum-product-subarray/
Can anyone help me?

I don't know about the first statement, but the second statement can be shown false by the following argument:
Since there are sqrt(n) sequences with n elements each, the total number of elements is n*sqrt(n). In the worst case you would need to look at every element at least once to merge them all into a single list, which puts the time complexity at at least n*sqrt(n). If there are sqrt(n) elements in each sequence, please read the edit.
I'm not really sure about the first one, because the algorithm you linked is for integers, while you are dealing with reals.
EDIT: A merging algorithm for k sorted arrays with n total elements has time complexity O(n*log(k)). Even if every sequence has sqrt(n) elements (as opposed to n each, as assumed in the previous paragraph), the time complexity would still be O(n*log(sqrt(n))).
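For reference, here is a minimal sketch of that O(n*log(k)) k-way merge using a min-heap; the function and variable names are illustrative, not taken from the question.

    #include <functional>
    #include <queue>
    #include <tuple>
    #include <vector>

    // Each pop/push on the heap costs O(log k), and every one of the n
    // elements is pushed and popped exactly once, giving O(n log k) overall.
    std::vector<double> mergeSorted(const std::vector<std::vector<double>>& seqs) {
        // (value, index of the sequence, position within that sequence)
        using Item = std::tuple<double, std::size_t, std::size_t>;
        std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;

        for (std::size_t s = 0; s < seqs.size(); ++s)
            if (!seqs[s].empty())
                heap.emplace(seqs[s][0], s, 0);

        std::vector<double> merged;
        while (!heap.empty()) {
            auto [value, s, pos] = heap.top();
            heap.pop();
            merged.push_back(value);
            if (pos + 1 < seqs[s].size())
                heap.emplace(seqs[s][pos + 1], s, pos + 1);
        }
        return merged;
    }

With k = sqrt(n) sequences this gives O(n*log(sqrt(n))), matching the edit above.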

Related

To find the sum of all consecutive sub-arrays of length k in a given array

I want to find the sums of all contiguous sub-arrays of length k
for a given array of length n, given that k < n. For example, let the given array be arr[6] = {1,2,3,4,5,6} and k = 3; then the answer is (6, 9, 12, 15).
It can be obtained as:
(1+2+3) = 6,
(2+3+4) = 9,
(3+4+5) = 12,
(4+5+6) = 15.
I have tried this using a sliding window of length k, but its time complexity is O(n). Is there any solution that takes even less time, such as O(log n)?
Unless you know certain specific properties of the array (e.g. the ordering of the elements, the range of the values included in the array, etc.), you need to look at each individual value, resulting in O(n) complexity.
If, for instance, you knew that the sum of the values in the array was T (perhaps because you knew T itself or were given the range), then you could use the fact that every element except the first and last (k-1) elements is included in k different window sums. The window sums therefore total T*k minus some amount contributed by the first and last k values, and by subtracting the appropriate multiples of those boundary values you could get an algorithm of complexity O(k).
But note that, in order to achieve a strategy like this, you would have to know some other specific information about the values in the array, be it their range or their sum.
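For completeness, here is a minimal sketch of the O(n) sliding-window approach the question already describes; the names are mine.

    #include <vector>

    // Compute the first window's sum once, then slide the window one step
    // at a time, adding the entering element and subtracting the leaving
    // one. Total work is O(n).
    std::vector<long long> windowSums(const std::vector<int>& arr, std::size_t k) {
        std::vector<long long> sums;
        if (k == 0 || k > arr.size()) return sums;

        long long current = 0;
        for (std::size_t i = 0; i < k; ++i) current += arr[i];
        sums.push_back(current);

        for (std::size_t i = k; i < arr.size(); ++i) {
            current += arr[i] - arr[i - k];
            sums.push_back(current);
        }
        return sums;
    }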
You can use the segment tree data structure. Building it takes O(n log n) if you insert elements one at a time (a bottom-up build is O(n)), and then you can find the sum of any interval in O(log n) and modify any element of the array in O(log n).
https://en.wikipedia.org/wiki/Segment_tree
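A minimal sketch of such a segment tree for range sums (an iterative variant, not tied to any particular library) might look like this:

    #include <vector>

    // Leaves live at indices [n, 2n); internal node i covers its two
    // children 2i and 2i+1. Queries and point updates are O(log n).
    struct SegmentTree {
        std::size_t n;
        std::vector<long long> tree;

        explicit SegmentTree(const std::vector<int>& data)
            : n(data.size()), tree(2 * data.size(), 0) {
            for (std::size_t i = 0; i < n; ++i) tree[n + i] = data[i];
            for (std::size_t i = n; i-- > 1; )           // build parents bottom-up
                tree[i] = tree[2 * i] + tree[2 * i + 1];
        }

        void update(std::size_t pos, long long value) {  // set data[pos] = value
            for (tree[pos += n] = value; pos > 1; pos /= 2)
                tree[pos / 2] = tree[pos] + tree[pos ^ 1];
        }

        long long query(std::size_t l, std::size_t r) const {  // sum of [l, r)
            long long sum = 0;
            for (l += n, r += n; l < r; l /= 2, r /= 2) {
                if (l & 1) sum += tree[l++];
                if (r & 1) sum += tree[--r];
            }
            return sum;
        }
    };

For the original problem you would query each window [i, i+k), at O(log n) per window; the plain sliding window above is still simpler unless the array is modified between queries.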

How to select the N least elements with limited space?

The problem:
A function f returns elements one at a time in an unknown order. I want to select the N least elements. Function f is called many times (I'm searching through a very complex search space) and I don't have enough memory to store every output element for later sorting.
The obvious solution:
Keep a vector of N elements in memory and, on each call to f(), search for the minimum and maximum and possibly replace something. This would probably work well for very small N. I'm looking for a more general solution, though.
My solution so far:
I thought about using a priority_queue to store, say, 2N values and discarding the upper half after every 2N steps.
Pseudocode:
    while (the search goes on)
        for (i = 0 .. 2N)
            el = f()
            push el to the priority queue
        remove the N greatest elements from the priority queue
    select the N least elements from the priority queue
I think this should work; however, I don't find it elegant at all. Maybe there is already some data structure that handles this problem. It would be really nice to just modify the priority_queue so that it throws away elements that don't fit into the saved range.
Could you recommend an existing std data structure for C++, or encourage me to implement the solution I suggested above? Or maybe there is some great and elegant trick that I can't think of.
You want to find the N least elements among a total of K elements obtained by calling a function. Each time you call f() you get one element, and you want to keep the N least elements seen so far without storing all K of them, since K is too big.
You can use a heap or priority_queue to store the N least elements found so far. Just add the item returned by f() to the priority queue and pop the greatest element whenever the size reaches N+1.
The total complexity is O(K*log(N)) and the space needed is O(N) (ignoring some extra space required by the priority queue).
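A sketch of that idea with std::priority_queue; f, calls and the other names below are placeholders standing in for the question's setup.

    #include <queue>
    #include <vector>

    // Keep the N smallest elements seen so far in a max-heap. When a new
    // element beats the current maximum, swap it in. Each step is O(log N).
    template <typename T, typename Source>
    std::vector<T> selectLeastN(Source f, std::size_t calls, std::size_t N) {
        std::priority_queue<T> heap;                      // max-heap of the N smallest so far
        for (std::size_t i = 0; i < calls; ++i) {
            T el = f();
            if (heap.size() < N) {
                heap.push(el);
            } else if (el < heap.top()) {
                heap.pop();
                heap.push(el);
            }
        }
        std::vector<T> result;                            // drained largest-first
        while (!heap.empty()) { result.push_back(heap.top()); heap.pop(); }
        return result;
    }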
An alternative would be to use an array. Depending on the maximum number of elements you can afford to keep around compared to N, there are two options I can think of:
Make the array as big as possible and keep it unsorted, periodically extracting the smallest elements.
Keep an array of size N, sorted with the maximum element at the end.
Option 1 would have you sort the array in O(n log n) time every time it fills up. That happens once for every n - N new elements (except the first time), yielding (k - n) / (n - N) sorts and a time complexity of O(((k - n) / (n - N)) * n log n), where k is the total number of elements, n the size of the array and N the number of elements to be selected. So for n = 2N you get O(2 * (k - 2N) * log 2N), if I'm not mistaken.
Option 2 would have you keep the array (of size N) sorted with the maximum elements at the end. Each time you get an element you can quickly (O(1)) check whether it is smaller than the last one. Using binary search you can find the right spot for the element in O(log N) time, but you then need to shift all the elements after the new one a place to the right, which takes O(N) time. So you end up with a theoretical O(k*N) time complexity. Given that computers like contiguous data access (caches and so on), this might still be faster than a heap, even an array-backed one; a sketch of this option follows below.
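Here is a hypothetical sketch of option 2 (the names are mine): a sorted buffer of at most N elements with the largest at the end, where std::upper_bound finds the insertion point in O(log N) and vector::insert performs the O(N) shift.

    #include <algorithm>
    #include <vector>

    // `best` is kept sorted ascending and never grows beyond N elements.
    void offer(std::vector<double>& best, std::size_t N, double el) {
        if (best.size() == N && el >= best.back()) return;   // not among the N smallest
        auto pos = std::upper_bound(best.begin(), best.end(), el);
        best.insert(pos, el);                                // O(N) shift happens here
        if (best.size() > N) best.pop_back();                // drop the current maximum
    }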
If your elements are big, you might be better off storing a structure of { comparison_value, actual_element_pointer }, even if you are using a heap (unless it is list-backed).

Search Algorithm to find the k lowest values in a list

I have a list that contains n double values and I need to find the k lowest double values in that list
k is much smaller than n
the initial list with the n double values is randomly ordered
the found k lowest double values are not required to be sorted
What algorithm would you recommend?
At the moment I use Quicksort to sort the whole list, and then I take the first k elements out of the sorted list. I expect there should be a much faster algorithm.
Thank you for your help!!!
You could model your solution to match the nlargest() code in Python's standard library.
Heapify the first k values on a maxheap.
Iterate over the remaining n - k values.
Compare each to the element at the top of the heap.
If the new value is lower, do a heapreplace operation (which replaces the topmost heap element with the new value and then sifts it downward).
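A minimal C++ version of those steps might look like the following (assuming 1 <= k <= values.size(); std::pop_heap followed by std::push_heap plays the role of heapreplace).

    #include <algorithm>
    #include <vector>

    // Build a max-heap over the first k values, then replace the heap's
    // maximum whenever a smaller value arrives.
    std::vector<double> kSmallest(const std::vector<double>& values, std::size_t k) {
        std::vector<double> heap(values.begin(), values.begin() + k);
        std::make_heap(heap.begin(), heap.end());            // max-heap of k values

        for (std::size_t i = k; i < values.size(); ++i) {
            if (values[i] < heap.front()) {                   // smaller than the current max
                std::pop_heap(heap.begin(), heap.end());      // move the max to the back
                heap.back() = values[i];                      // replace it
                std::push_heap(heap.begin(), heap.end());     // restore the heap property
            }
        }
        return heap;                                          // the k smallest, in heap order
    }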
The algorithm can be surprisingly efficient. For example, when n=100,000 and k=100, the number of comparisons is typically around 106,000 for randomly arranged inputs. This is only slightly more than 100,000 comparisons to find a single minimum value. And, it does about twenty times fewer comparisons than a full quicksort on the whole dataset.
The relative strength of various algorithms is studied and summarized at: http://code.activestate.com/recipes/577573-compare-algorithms-for-heapqsmallest
You can use a selection algorithm to find the k-th lowest element and then iterate over the list, returning it together with all elements that are lower than it. More work has to be done if the list can contain duplicates (to make sure you don't end up with more elements than you need).
This solution is O(n).
The selection algorithm is implemented in C++ as nth_element().
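For example (a sketch; the vector is taken by value so the caller's data is untouched, and k is assumed to be at most values.size()):

    #include <algorithm>
    #include <vector>

    // nth_element partitions the range so that the first k slots hold the
    // k smallest values (in unspecified order), in O(n) average time.
    std::vector<double> leastK(std::vector<double> values, std::size_t k) {
        std::nth_element(values.begin(), values.begin() + k, values.end());
        values.resize(k);
        return values;
    }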
Another alternative is to use a max-heap of size k and iterate over the elements while maintaining the heap so that it holds the k smallest elements seen so far.
    for each element x:
        if heap.size() < k:
            heap.add(x)
        else if x < heap.max():
            heap.pop()
            heap.add(x)
When you are done, the heap contains the k smallest elements.
This solution is O(n log k).
Take a look at partial_sort algorithm from C++ standard library.
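A quick sketch of how that would look (again assuming k <= values.size()):

    #include <algorithm>
    #include <vector>

    // partial_sort places the k smallest elements, sorted ascending, at the
    // front of the range in O(n log k) time.
    void smallestToFront(std::vector<double>& values, std::size_t k) {
        std::partial_sort(values.begin(), values.begin() + k, values.end());
        // values[0..k-1] now hold the k smallest elements
    }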
You can use std::nth_element. This has O(N) average complexity because it doesn't sort the elements; it just rearranges them so that every element before the chosen position is no greater than the element placed at that position.
You can use selection sort: it takes O(n) to select the lowest value. Once we have placed this lowest value at position 1, we can rescan the data set to find the second lowest value, and continue until we have the k-th lowest value. This way, if k is sufficiently smaller than n, we get complexity O(k*n), which is O(n) when k is treated as a constant.

Algorithm to find a duplicate entry in constant space and O(n) time

Given an array of N integers such that only one integer is repeated, find the repeated integer in O(n) time and constant space. There is no bound on the value of the integers or on N.
For example, given an array of 6 integers: 23 45 67 87 23 47, the answer is 23.
(I hope this covers the ambiguous and vague parts.)
I searched on the net but was unable to find any such question in which the range of the integers was not fixed.
Also, here is an example that answers a question similar to mine, but there the author created a hash table indexed by the highest integer value in C++. C++ does not allow creating an array with 2^64 elements (on a 64-bit computer).
I'm sorry I didn't mention it before: the array is immutable.
Jun Tarui has shown that any duplicate finder using O(log n) space requires at least Ω(log n / log log n) passes, which exceeds linear time. I.e. your question is provably unsolvable even if you allow logarithmic space.
There is an interesting algorithm by Gopalan and Radhakrishnan that finds duplicates in one pass over the input and O((log n)^3) space, which sounds like your best bet a priori.
Radix sort has time complexity O(kn) where k > log_2 n often gets viewed as a constant, albeit a large one. You cannot implement a radix sort in constant space obviously, but you could perhaps reuse your input data's space.
There are numerical tricks if you assume features of the numbers themselves. If the numbers are exactly 1 through n with one of them repeated, then simply add them up and subtract n(n+1)/2. If all the numbers are primes, you could cheat by ignoring the running time of division.
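A minimal sketch of that sum trick, under the assumption that the array holds every value 1..n exactly once plus one extra copy of the duplicate (n+1 entries in total):

    #include <cstdint>
    #include <vector>

    // The duplicate is the difference between the actual sum and n(n+1)/2.
    std::int64_t findDuplicateBySum(const std::vector<std::int64_t>& a) {
        std::int64_t n = static_cast<std::int64_t>(a.size()) - 1;
        std::int64_t sum = 0;
        for (std::int64_t x : a) sum += x;
        return sum - n * (n + 1) / 2;
    }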
As an aside, there is a well-known lower bound of Ω(log_2(n!)) on comparison sorting, which suggests that google might help you find lower bounds on simple problems like finding duplicates as well.
If the array isn't sorted, you can only do it in O(n log n).
Some approaches can be found here.
If the range of the integers is bounded, you can perform a counting sort variant in O(n) time. The space complexity is O(k) where k is the upper bound on the integers(*), but that's a constant, so it's O(1).
If the range of the integers is unbounded, then I don't think there's any way to do this, but I'm not an expert at complexity puzzles.
(*) It's O(k) since there's also a constant upper bound on the number of occurrences of each integer, namely 2.
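A sketch of that counting variant, assuming the entries are non-negative and bounded by a known maxValue (both names here are illustrative):

    #include <vector>

    // Mark each value as seen; the first value seen twice is the duplicate.
    // Space is O(maxValue), which is O(1) if the bound is a constant.
    int findDuplicateBounded(const std::vector<int>& a, int maxValue) {
        std::vector<bool> seen(static_cast<std::size_t>(maxValue) + 1, false);
        for (int x : a) {
            if (seen[x]) return x;
            seen[x] = true;
        }
        return -1;   // no duplicate found (shouldn't happen per the problem statement)
    }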
In the case where the entries are bounded by the length of the array, then you can check out Find any one of multiple possible repeated integers in a list and the O(N) time and O(1) space solution.
The generalization you mention is discussed in this follow up question: Algorithm to find a repeated number in a list that may contain any number of repeats and the O(n log^2 n) time and O(1) space solution.
The approach that would come closest to O(N) in time is probably a conventional hash table, where the hash entries are simply the numbers, used as keys. You'd walk through the list, inserting each entry in the hash table, after first checking whether it was already in the table.
Not strictly O(N), however, since hash search/insertion gets slower as the table fills up. And in terms of storage it would be expensive for large lists -- at least 3x and possibly 10-20x the size of the array of numbers.
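A sketch with std::unordered_set, which is expected O(N) time but uses O(N) extra space, so it does not meet the constant-space requirement of the original question:

    #include <unordered_set>
    #include <vector>

    long long findDuplicateHash(const std::vector<long long>& a) {
        std::unordered_set<long long> seen;
        seen.reserve(a.size());
        for (long long x : a) {
            if (!seen.insert(x).second) return x;   // insert failed: x was already present
        }
        return -1;   // no duplicate (shouldn't happen per the problem statement)
    }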
As was already mentioned by others, I don't see any way to do it in O(n).
However, you can try a probabilistic approach by using a Bloom Filter. It will give you O(n) if you are lucky.
Since extra space is not allowed, this can't be done without comparisons. The lower bound on the time complexity of comparison sorting can be applied here to show that the problem in its original form can't be solved in O(n) in the worst case.
A simple approach is to sort and then scan adjacent elements; note that because of the sort this is O(n log n), not linear time:
    import java.util.Arrays;

    public class DuplicateInOnePass {

        public static void duplicate() {
            int[] ar = {6, 7, 8, 8, 7, 9, 9, 10};
            Arrays.sort(ar);                       // O(n log n)
            // after sorting, any duplicate shows up as two equal neighbours
            for (int i = 0; i < ar.length - 1; i++) {
                if (ar[i] == ar[i + 1])
                    System.out.println("Duplicate element: " + ar[i]);
            }
        }

        public static void main(String[] args) {
            duplicate();
        }
    }

The amortized complexity of std::next_permutation?

I just read this other question about the complexity of next_permutation and while I'm satisfied with the response (O(n)), it seems like the algorithm might have a nice amortized analysis that shows a lower complexity. Does anyone know of such an analysis?
So it looks like I'm going to be answering my own question in the affirmative: yes, next_permutation runs in O(1) amortized time.
Before I go into a formal proof of this, here's a quick refresher on how the algorithm works. First, it scans backwards from the end of the range toward the beginning, identifying the longest contiguous decreasing subsequence in the range that ends at the last element. For example, in 0 3 4 2 1, the algorithm would identify 4 2 1 as this subsequence. Next, it looks at the element right before this subsequence (in the above example, 3), then finds the smallest element in the subsequence larger than it (in the above example, 4). Then, it exchanges the positions of those two elements and then reverses the identified sequence. So, if we started with 0 3 4 2 1, we'd swap the 3 and 4 to yield 0 4 3 2 1, and would then reverse the last three elements to yield 0 4 1 2 3.
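A sketch of those steps in C++ (essentially the textbook next_permutation; it assumes distinct elements and works on a vector rather than an iterator pair):

    #include <algorithm>
    #include <vector>

    bool nextPermutationSketch(std::vector<int>& v) {
        if (v.size() < 2) return false;

        // 1. Scan backwards for the longest decreasing suffix.
        std::size_t i = v.size() - 1;
        while (i > 0 && v[i - 1] >= v[i]) --i;
        if (i == 0) return false;                 // whole range decreasing: last permutation

        // 2. Find the rightmost (i.e. smallest) suffix element larger than v[i-1]...
        std::size_t j = v.size() - 1;
        while (v[j] <= v[i - 1]) --j;

        // 3. ...swap it with v[i-1] and reverse the suffix.
        std::swap(v[i - 1], v[j]);
        std::reverse(v.begin() + i, v.end());
        return true;
    }

On 0 3 4 2 1 this swaps the 3 and 4 and reverses the suffix, producing 0 4 1 2 3, exactly as described above.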
To show that this algorithm runs in amortized O(1), we'll use the potential method. Define Φ to be three times the length of the longest contiguously decreasing subsequence at the end of the sequence. In this analysis, we will assume that all the elements are distinct. Given this, let's think about the runtime of this algorithm. Suppose that we scan backwards from the end of the sequence and find that the last m elements are part of the decreasing sequence. This requires m + 1 comparisons. Next, we find, among the elements of that sequence, the smallest one larger than the element preceding the sequence. In the worst case this takes time proportional to the length of the decreasing sequence, another m comparisons with a linear scan. Swapping the elements takes, say, 1 credit's worth of time, and reversing the sequence then requires at most m more operations. Thus the real runtime of this step is roughly 3m + 1. However, we also have to account for the change in potential. After we reverse this sequence of length m, the longest decreasing sequence at the end of the range has length 1, because reversing the decreasing suffix leaves the last elements of the range sorted in ascending order. This means that our potential changed from Φ = 3m to Φ' = 3 * 1 = 3, a change of 3 - 3m, so our net amortized time is 3m + 1 + (3 - 3m) = 4 = O(1).
In the preceding analysis I made the simplifying assumption that all the values are unique. To the best of my knowledge, this assumption is necessary in order for this proof to work. I'm going to think this over and see if the proof can be modified to work in the case where the elements can contain duplicates, and I'll post an edit to this answer once I've worked through the details.
I am not really sure of the exact implementation of std::next_permutation, but if it is the same as Narayana Pandita's algorithm as described in the wiki here: http://en.wikipedia.org/wiki/Permutation#Systematic_generation_of_all_permutations,
then, assuming the elements are distinct, it looks like it is O(1) amortized! (Of course, there might be errors in what follows.)
Let us count the total number of swaps done.
We get the recurrence relation

    T(n+1) = (n+1)*T(n) + Θ(n^2)

(n+1)*T(n) comes from fixing the first element and doing the swaps for the remaining n.
Θ(n^2) comes from changing the first element: at the point we change the first element we do Θ(n) swaps, and doing that n times gives Θ(n^2).
Now let X(n) = T(n)/n!. Then we get

    X(n+1) = X(n) + Θ(n^2)/(n+1)!

i.e. there is some constant C such that

    X(n+1) <= X(n) + C*n^2/(n+1)!

Writing down n such inequalities gives us

    X(n+1) - X(n)   <= C*n^2/(n+1)!
    X(n)   - X(n-1) <= C*(n-1)^2/n!
    X(n-1) - X(n-2) <= C*(n-2)^2/(n-1)!
    ...
    X(2)   - X(1)   <= C*1^2/2!

Adding these up gives X(n+1) - X(1) <= C*(sum_{j=1}^{n} j^2/(j+1)!).
Since the infinite series sum_{j=1}^{infinity} j^2/(j+1)! converges to some constant C', say, we get X(n+1) - X(1) <= C*C'.
Remember that X(n) counts the average number of swaps needed (T(n)/n!).
Thus the average number of swaps is O(1).
Since finding the elements to swap is linear in the number of swaps, it is O(1) amortized even if you take other operations into consideration.
Here n stands for the number of elements in the container, not the total number of possible permutations. The algorithm must iterate through an ordering of all elements at each call; it takes a pair of bidirectional iterators, which implies that to get to one element the algorithm must first visit the one before it (unless it's the first or last element). A bidirectional iterator allows iterating backwards, so the algorithm can (must, in fact) perform half as many swaps as there are elements. I believe the standard could offer an overload for a forward iterator, which would support dumber iterators at the cost of n swaps rather than n/2 swaps. But alas, it didn't.
Of course, taken over all possible permutations, the algorithm operates in O(1) amortized per call.