Find all pairwise differences in an array of distinct integers less than 1e5 - c++

Given an array of distinct positive integers ≤ 10^5, I need to find the differences of all pairs.
I don't really need to count the frequency of every difference, just the unique differences.
Using brute force, this can be approached by checking all possible pairs. However, this would not be efficient enough given the size of the array (since all elements are distinct, the maximum size is 10^5), and it leads to O(n^2) complexity.
I need to exploit the property of this array that the differences are also ≤ 10^5.
So here is another approach:
The array elements can be represented by a hash array in which the indices corresponding to array elements are set to 1 and the rest to 0.
This hash array is then treated as a polynomial whose nonzero coefficients are all 1 and whose exponents are the corresponding array values.
Now clone this polynomial and make another polynomial with exponents negated.
If now these polynomials are multiplied, all the positive exponents in the result correspond to differences required.
However, I am not sure how to implement this multiplication efficiently. I think FFT can be used, since it multiplies two polynomials in O(n log n), but it requires non-negative exponents.
Please provide suggestions on how to proceed.
I also came across this algorithm, which uses FFT to find pairwise differences in O(n log n), however I can't understand how the algorithm works. It seems to be trying to find all possible sums.
A proof of this algorithm would be appreciated.
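For illustration, here is one way to sidestep the negative exponents: shift the negated-exponent polynomial by M = 10^5 so every exponent is non-negative, multiply the two polynomials with FFT, and read the differences off the coefficients of x^(M+d) for d > 0. This is only a rough C++ sketch (the FFT routine, the function names and the tiny test are illustrative, not a reference implementation):

#include <bits/stdc++.h>
using namespace std;

// Iterative in-place FFT (Cooley-Tukey); invert = true computes the inverse transform.
void fft(vector<complex<double>>& a, bool invert) {
    int n = a.size();
    for (int i = 1, j = 0; i < n; ++i) {
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) swap(a[i], a[j]);
    }
    for (int len = 2; len <= n; len <<= 1) {
        double ang = 2 * acos(-1.0) / len * (invert ? -1 : 1);
        complex<double> wlen(cos(ang), sin(ang));
        for (int i = 0; i < n; i += len) {
            complex<double> w(1);
            for (int j = 0; j < len / 2; ++j) {
                complex<double> u = a[i + j], v = a[i + j + len / 2] * w;
                a[i + j] = u + v;
                a[i + j + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    if (invert)
        for (auto& x : a) x /= n;
}

// All distinct positive pairwise differences of distinct values in [1, M].
// A(x) = sum of x^v, B(x) = sum of x^(M - v); in A*B the coefficient of
// x^(M + d) is nonzero exactly when d occurs as a difference.
vector<int> pairwiseDifferences(const vector<int>& vals, int M = 100000) {
    int sz = 1;
    while (sz < 2 * M + 1) sz <<= 1;             // the product has degree 2M
    vector<complex<double>> A(sz), B(sz);
    for (int v : vals) {
        A[v] += 1;
        B[M - v] += 1;                           // "negated" exponents, shifted up by M
    }
    fft(A, false); fft(B, false);
    for (int i = 0; i < sz; ++i) A[i] *= B[i];
    fft(A, true);
    vector<int> diffs;
    for (int d = 1; d <= M; ++d)
        if (llround(A[M + d].real()) > 0) diffs.push_back(d);
    return diffs;
}

int main() {
    for (int d : pairwiseDifferences({1, 5, 12})) cout << d << ' ';   // prints: 4 7 11
    cout << '\n';
}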

Related

If you have an array of fixed size, and you need to go through it n! times, how does using binary search change the time complexity?

We want to find all permutations of a string of length n. Then, for each permutation, you search an array of fixed constant size, say 3000, and check whether the string is in the array.
String arr[3000];
Because we will have n! permutations, we need to do n! searches.
Also, what difference does it make when you check 2 different strings against an element in the array versus just checking 1 string?
What is the time complexity?
My thought is that it will take at worst log2(3000) comparisons to search the array once. The time complexity of that is O(log2(3000)), which is O(1).
Now, you need to search this array n! times, so the time complexity is O(n!).
So the fact that binary search reduces the work required per search should not be the focus when analyzing the time complexity of this algorithm.
My question is: binary search does reduce the work per search, and if you are going to search the array n! times, shouldn't this make a significant difference?
Any insight to better my understanding is appreciated.
Big O complexity analysis only deals with quantities that are subject to change, by definition. That you get vacuous answers when all your quantities are constant is expected.
The constant factors are relevant when comparing two algorithms of equal Big-O, so your change from 3000 to log2(3000) ≈ 11.6 comparisons per search is a constant factor of roughly 260.
Thus you use the binary search because you are doing more than Big-O analysis: you have also estimated the constant factors, and see an easy ~260x speedup.
But equally you can have multiple terms in your complexity. You might say:
Let n be the input string length
Let m be the size of arr
Our algorithm is O( n * n! * log(m) ) (n for the string equality, n! for the permutations, log(m) for the binary searching)
It also rather depends on a model of cost. Usually this maps back to some abstract machine, i.e. we assume that operations have a certain cost. For example, you might compare sorting algorithms by the count of comparisons alone, by the count of swaps alone, or by both comparisons and swaps.

Given a set of positive integers <= k and n of its subsets, find which pairs of subsets give the original set as their union

I have a set A consisting of the first p positive integers (1 to p), and I am given n subsets of this set. How can I find how many pairs of subsets give the original set A as their union?
Of course this can be done naively by checking the size of the union of each pair; if it equals p, the union must be the set A. But is there a more elegant way of doing this that reduces the time complexity?
std::set_union in C++ performs at most 2*(size(set1) + size(set2)) - 1 comparisons, which is not good for nC2 pairs.
If we need to cope with a worst-case scenario, here are some ideas about this problem:
I suppose that using std::bitset without any other optimizations would be sufficient for this task because of its much faster union operation (a sketch follows these points). If not, don't use variable-size vectors; use simple p-length 0-1 arrays/vectors or unordered_sets. I don't think variable-size vectors without an O(1) find operation would do better in worst-case scenarios.
Use heuristics to minimize the number of subset unions. The simplest heuristic is checking the sizes of the subsets: we only need those pairs (A, B) of subsets where size(A) + size(B) >= p.
In addition to heuristics, we can count (in O(n^2)) how often every number appears across the subsets. After that, we can check the presence of the numbers in some subset(s) in order of increasing frequency. Also, we can exclude those numbers that appear in every subset.
If you fix some subset A (in the outer loop, for example) and form unions with the other subsets, you only need to check those numbers that do not appear in A. If the subset A is large enough, this can dramatically reduce the number of operations needed.
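As an illustration of the first point, here is a rough C++ sketch combining std::bitset unions with the size heuristic; the compile-time bound P, the input representation and the function name are assumptions made for this example:

#include <bits/stdc++.h>
using namespace std;

const int P = 1000;                       // assumed compile-time bound on p

// Counts pairs (i, j), i < j, whose union is {1, ..., p}.
// subsets[i] holds the elements of the i-th subset; assumes p <= P.
long long countCoveringPairs(const vector<vector<int>>& subsets, int p) {
    int n = subsets.size();
    vector<bitset<P + 1>> bs(n);
    vector<int> sz(n);
    for (int i = 0; i < n; ++i) {
        for (int x : subsets[i]) bs[i].set(x);
        sz[i] = bs[i].count();
    }
    bitset<P + 1> full;
    for (int x = 1; x <= p; ++x) full.set(x);

    long long pairs = 0;
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j) {
            if (sz[i] + sz[j] < p) continue;         // size heuristic: the union cannot cover A
            if ((bs[i] | bs[j]) == full) ++pairs;    // union + comparison cost roughly p/64 words
        }
    return pairs;
}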
Just a possible improvement to your approach: instead of binary searching, you can keep a boolean array to find out in O(1) whether some x appears in array i.
For example,
Let's say that while reading the input you record all the appearances for array i, meaning that isThere[i][x] is true if x appears in array i and false otherwise.
This can save some time.
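A small sketch of what building that lookup table could look like while reading the input (the variable names and input format are assumed):

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, p;                    // n arrays/subsets, values in 1..p (assumed input format)
    cin >> n >> p;
    vector<vector<char>> isThere(n, vector<char>(p + 1, 0));
    for (int i = 0; i < n; ++i) {
        int sz; cin >> sz;       // size of the i-th array, followed by its elements
        while (sz--) {
            int x; cin >> x;
            isThere[i][x] = 1;   // later, "does x appear in array i?" is an O(1) lookup
        }
    }
}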

Checking efficiently if three binary vectors are linearly independent over finite field

I am given three binary vectors v1, v2, v3, represented by unsigned int in my program, and a finite field F, which is also a set of binary vectors. I need to check whether the vectors are linearly independent, that is, whether there are no f1, f2 in F such that f1*v1 + f2*v2 = v3.
The immediate brute force solution is to iterate over the field and check all possible linear combinations.
Does there exist a more efficient algorithm?
I'd like to emphasize two points:
The field elements are vectors, not scalars. Therefore, a product of a field element f1 and a given vector vi is a dot product, so Gaussian elimination does not work (if I am not missing something).
The field is finite, so if I find that f1*v1 + f2*v2 = v3 for some f1, f2, it does not mean that f1, f2 belong to F.
If the vectors are in R^2, then they are automatically dependent, because when we make a matrix out of them and reduce it to echelon form there will be at least one free variable (in this case exactly one).
If the vectors are in R^3, then you can make a matrix from them, i.e. a 2D array, and take the determinant of that matrix. If the determinant is 0, the vectors are linearly dependent; otherwise they are not.
If the vectors are in R^4, R^5 and so on, then the appropriate way is to reduce the matrix to echelon form.
For any finite set of M vectors defined in a space of dimension N, they are linearly independent iff the MxN matrix constructed by stacking these vectors row by row has rank equal to M.
Regarding numerically stable computation involving linear algebra, the singular value decomposition is usually the way to go, and there are plenty of implementations available out there. The key point in this context is that the rank of a matrix equals the number of its nonzero singular values. One must however note that, due to floating point approximations, a finite precision must be chosen to decide whether a value is effectively zero.
Your question mentions your vectors are defined in the set of integers and that certainly can be taken advantage of to overcome the finite precision of floating point computations, but I would not know how. Maybe somebody out there could help us out?
Gaussian elimination does work if you do it inside the finite field.
For the binary case it should be quite simple, because the inverse element is trivial.
For larger finite fields, you will somehow need to find inverse elements, which may turn into a separate problem of its own.
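For the plain binary (GF(2)) case, where addition is XOR and the only nonzero scalar is 1, a minimal sketch of this elimination using a greedy XOR basis is below; the struct name and the example vectors are purely illustrative. For an extension field such as GF(2^k), the same elimination applies, but with field multiplication and inverses in place of plain XOR:

#include <bits/stdc++.h>
using namespace std;

// Greedy XOR basis over GF(2): insert() returns false when v is already
// in the span of the previously inserted vectors (addition is XOR).
struct XorBasis {
    vector<unsigned int> basis;
    bool insert(unsigned int v) {
        for (unsigned int b : basis)
            v = min(v, v ^ b);        // cancel the leading bit of each pivot if it is set in v
        if (v == 0) return false;     // v is dependent on the earlier vectors
        basis.push_back(v);
        return true;
    }
};

int main() {
    unsigned int v1 = 0b1010, v2 = 0b0110, v3 = 0b1100;   // v3 = v1 XOR v2
    XorBasis B;
    bool independent = B.insert(v1) && B.insert(v2) && B.insert(v3);
    cout << (independent ? "independent" : "dependent") << '\n';   // prints "dependent"
}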

Linear algorithm to find minimum subset sum over a threshold

I have a collection of N positive integers, each bounded by a (relatively small) constant C. I want to find a subset of these numbers with the smallest sum greater than (or equal to) a value K.
The numbers involved aren't terribly large (<100), but I need good performance even in the worst case. I thought maybe I could adapt Pisinger's dynamic programming algorithm to the task; it runs in O(NC) time, and I happen to meet the requirements of bounded, positive numbers.
[Edit]: The numbers are not sorted and there may be duplicates.
However, I don't understand the algorithm well enough to do this myself. In fact, I'm not even certain if the assumptions it is based on still hold...
-Is it possible to adapt this algorithm to my needs?
-Or is there another linear algorithm I could use that is similarly efficient?
-Could anyone provide pseudocode or a detailed explanation?
Thanks.
Link to the Subset-Sum code I was investigating:
Fast solution to Subset sum algorithm by Pisinger
(Apologies if this is poorly worded/formatted/etc. I'm still new to StackOverflow...)
Pisinger's algorithm gives you the largest sum less than or equal to the capacity of the knapsack. To solve your problem, use Pisinger to figure out what not to put in the subset. Formally, let the items be w_1, ..., w_n and the minimum be K. Give Pisinger the items w_1, ..., w_n and capacity w_1 + ... + w_n - K, then take every item that Pisinger does not.
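To illustrate the reduction, here is a rough sketch that substitutes a plain bitset subset-sum DP for Pisinger's algorithm (the DP is O(N*S/64) with S the total sum, not Pisinger's O(NC)); the bound MAXSUM and the function name are assumptions made for this example:

#include <bits/stdc++.h>
using namespace std;

// Smallest subset sum >= K via the complement trick: the best subset to keep
// is the complement of the largest subset sum <= total - K.
const int MAXSUM = 10000;                          // assumed bound on the total sum

int minSubsetSumAtLeast(const vector<int>& w, int K) {
    int total = accumulate(w.begin(), w.end(), 0);
    if (K <= 0) return 0;                          // the empty subset already suffices
    if (total < K) return -1;                      // no subset can reach K
    bitset<MAXSUM + 1> reach;
    reach[0] = 1;                                  // the empty subset sums to 0
    for (int x : w) reach |= reach << x;           // add each item to every reachable sum
    for (int s = total - K; s >= 0; --s)
        if (reach[s]) return total - s;            // complement of the best "left out" sum
    return -1;                                     // cannot happen: reach[0] is always set
}

int main() {
    cout << minSubsetSumAtLeast({8, 6, 5, 3}, 10) << '\n';     // prints 11 (6 + 5)
}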
Well one solution is to:
T = {0}
for x in V
    T = T ∪ { x + t : t in T }     // build the new sums from a snapshot of T, not while iterating over it
for i in K to max(T)
    if (T.contains(i))
        return i
fail
This gives you the sum of the chosen subset, but you can adapt it to output the members as well.
The maximum size of T is O(NC), which is O(N) since C is a constant bound, so the running time is O(N^2) and the space is O(N). You can use a bit array of length NC as the backing store for T.

Algorithm to find a duplicate entry in constant space and O(n) time

Given an array of N integers such that exactly one integer is repeated, find the repeated integer in O(n) time and constant space. There is no bound on the values of the integers or on N.
For example, given an array of 6 integers 23 45 67 87 23 47, the answer is 23.
(I hope this covers the ambiguous and vague parts.)
I searched the net but was unable to find any such question where the range of the integers was not fixed.
Also, here is an example that answers a question similar to mine, but there a hash table was created sized by the highest integer value in C++. However, C++ does not allow creating an array with 2^64 elements (on a 64-bit computer).
I am sorry I didn't mention it before: the array is immutable.
Jun Tarui has shown that any duplicate finder using O(log n) space requires at least Ω(log n / log log n) passes, which exceeds linear time. I.e. your question is provably unsolvable even if you allow logarithmic space.
There is an interesting algorithm by Gopalan and Radhakrishnan that finds duplicates in one pass over the input and O((log n)^3) space, which sounds like your best bet a priori.
Radix sort has time complexity O(kn) where k > log_2 n often gets viewed as a constant, albeit a large one. You cannot implement a radix sort in constant space obviously, but you could perhaps reuse your input data's space.
There are numerical tricks if you assume features about the numbers themselves. If almost all numbers between 1 and n are present, then simply add them up and subtract n(n+1)/2. If all the numbers are primes, you could cheat by ignoring the running time of division.
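As a tiny illustration of the summation trick, assuming the array holds every value 1..n exactly once plus one extra copy of the duplicate (the function name is illustrative):

#include <bits/stdc++.h>
using namespace std;

// Assumes the array has n + 1 entries: every value 1..n once, plus one duplicate.
long long findDuplicateBySum(const vector<long long>& a) {
    long long n = (long long)a.size() - 1;
    long long expected = n * (n + 1) / 2;                      // 1 + 2 + ... + n
    long long actual = accumulate(a.begin(), a.end(), 0LL);
    return actual - expected;                                  // the surplus is the repeated value
}

int main() {
    cout << findDuplicateBySum({1, 2, 3, 4, 2}) << '\n';       // prints 2
}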
As an aside, there is a well-known lower bound of Ω(log_2(n!)) on comparison sorting, which suggests that google might help you find lower bounds on simple problems like finding duplicates as well.
If the array isn't sorted, you can only do it in O(n log n).
Some approaches can be found here.
If the range of the integers is bounded, you can perform a counting sort variant in O(n) time. The space complexity is O(k), where k is the upper bound on the integers (*), but that's a constant, so it's O(1).
If the range of the integers is unbounded, then I don't think there's any way to do this, but I'm not an expert at complexity puzzles.
(*) It's O(k) since there's also a constant upper bound on the number of occurrences of each integer, namely 2.
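A sketch of that counting variant, with K an assumed constant bound on the values and an illustrative function name:

#include <bits/stdc++.h>
using namespace std;

const int K = 1000000;            // assumed constant upper bound on the values

// Counting-sort-style duplicate search for values known to lie in [0, K).
// Uses O(K) space, which counts as O(1) when K is a fixed constant.
int findDuplicateBounded(const vector<int>& a) {
    vector<char> seen(K, 0);
    for (int x : a) {
        if (seen[x]) return x;    // second occurrence found
        seen[x] = 1;
    }
    return -1;                    // no duplicate
}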
In the case where the entries are bounded by the length of the array, then you can check out Find any one of multiple possible repeated integers in a list and the O(N) time and O(1) space solution.
The generalization you mention is discussed in this follow up question: Algorithm to find a repeated number in a list that may contain any number of repeats and the O(n log^2 n) time and O(1) space solution.
The approach that would come closest to O(N) in time is probably a conventional hash table, where the hash entries are simply the numbers, used as keys. You'd walk through the list, inserting each entry in the hash table, after first checking whether it was already in the table.
Not strictly O(N), however, since hash search/insertion gets slower as the table fills up. And in terms of storage it would be expensive for large lists -- at least 3x and possibly 10-20x the size of the array of numbers.
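A sketch of that hash-table walk (names are illustrative):

#include <bits/stdc++.h>
using namespace std;

// Hash-set walk: expected O(n) time, but O(n) extra space,
// so it does not meet the constant-space requirement of the question.
long long findDuplicateHashed(const vector<long long>& a) {
    unordered_set<long long> seen;
    for (long long x : a)
        if (!seen.insert(x).second)   // insert() reports whether x was already present
            return x;
    return -1;                        // no duplicate
}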
As was already mentioned by others, I don't see any way to do it in O(n).
However, you can try a probabilistic approach by using a Bloom Filter. It will give you O(n) if you are lucky.
Since extra space is not allowed, this can't be done without comparisons. The concept of a lower bound on the time complexity of comparison sorting can be applied here to prove that the problem in its original form can't be solved in O(n) in the worst case.
We can also solve this by sorting first, which takes O(n log n):
import java.util.Arrays;

public class DuplicateInOnePass {
    public static void duplicate() {
        int[] ar = {6, 7, 8, 8, 7, 9, 9, 10};
        Arrays.sort(ar);
        // After sorting, any repeated values sit next to each other.
        for (int i = 0; i < ar.length - 1; i++) {
            if (ar[i] == ar[i + 1])
                System.out.println("Duplicate element: " + ar[i]);
        }
    }

    public static void main(String[] args) {
        duplicate();
    }
}