Finding closest triplet in a set of three vectors? - c++

Given three vectors of double, I want to group the elements into triples, one element from each vector, such that the difference between the largest and smallest element in each triple is minimized, and every element of every vector is part of a triple. Right now, I'm using std::lower_bound():
double closest(vector<double> const& vec, double value){
    // vec must be sorted; note that lower_bound returns vec.end() when every
    // element is less than value, so dereferencing it blindly is unsafe
    auto const ret = std::lower_bound(vec.begin(), vec.end(), value);
    return *ret;
}
int main(){
    vector<double> a, b, c;
    vector<vector<double>> triples;
    for(auto x : a){
        triples.push_back({x, closest(b, x), closest(c, x)});
    }
}
Pretend a, b, and c here are populated with some values. The problem is, lower_bound() returns an iterator to the first element not less than the argument. I would also like to consider elements less than the argument. Is there a nice way to do this?
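One common approach (a sketch, assuming each vector is sorted and non-empty) is to compare the lower_bound result with its predecessor and keep whichever is nearer:
#include <algorithm>
#include <vector>

// Returns the element of the sorted, non-empty vec closest to value by
// checking both the first element >= value and the element just before it.
double closest(std::vector<double> const& vec, double value){
    auto const it = std::lower_bound(vec.begin(), vec.end(), value);
    if (it == vec.begin()) return *it;      // every element is >= value
    if (it == vec.end()) return *(it - 1);  // every element is < value
    double const above = *it;
    double const below = *(it - 1);
    return (above - value < value - below) ? above : below;
}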

My solution was to implement a binary search that terminates in a comparison of the neighboring elements. Another possible solution is to walk over the three sorted arrays from their last elements, each step decrementing the index of whichever array currently holds the largest element; since every iteration decrements exactly one index, the whole scan is $O(n_A + n_B + n_C)$, versus $O(\log{n})$ per lookup for binary search:
// i, j, k start at the last index of (sorted) A, B, C respectively
int solve(int A[], int B[], int C[], int i, int j, int k)
{
    int min_diff, current_diff, max_term;
    min_diff = INT_MAX; // from <climits>
    while (i != -1 && j != -1 && k != -1)
    {
        current_diff = max(A[i], max(B[j], C[k]))
                     - min(A[i], min(B[j], C[k])); // max - min is never negative
        if (current_diff < min_diff)
            min_diff = current_diff;
        // Decrement the index that currently holds the maximum element.
        max_term = max(A[i], max(B[j], C[k]));
        if (A[i] == max_term)
            i -= 1;
        else if (B[j] == max_term)
            j -= 1;
        else
            k -= 1;
    }
    return min_diff;
}
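Presumably solve is called with the last valid index of each sorted array. A hypothetical usage (the array contents are made up for illustration):
#include <cstdio>

int main()
{
    int A[] = {1, 4, 10};
    int B[] = {2, 15, 20};
    int C[] = {10, 12};
    // The best triple here is {10, 15, 12}: max 15, min 10, difference 5.
    std::printf("%d\n", solve(A, B, C, 2, 2, 1)); // prints 5
    return 0;
}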

Related

Maintain an unordered_map but at the same time need the lowest of its mapped values at every step

I have an unordered_map<int, int> which is updated at every step of a for loop. But at the end of the loop, I also need the lowest of the mapped values. Traversing it to find the minimum in O(n) is too slow. I know Boost has a MultiIndex container, but I can't use Boost. What is the simplest way this can be done using only the STL?
Question:
Given an array A of positive integers, call a (contiguous, not
necessarily distinct) subarray of A good if the number of different
integers in that subarray is exactly K.
(For example, [1,2,3,1,2] has 3 different integers: 1, 2, and 3.)
Return the number of good subarrays of A.
My code:
class Solution {
public:
    int subarraysWithKDistinct(vector<int>& A, int K) {
        int left, right;
        unordered_map<int, int> M; // value -> index of its latest occurrence
        for (left = right = 0; right < A.size() && M.size() < K; ++right)
            M[A[right]] = right;
        if (right == A.size())
            return 0;
        int smallest, count;
        smallest = numeric_limits<int>::max();
        for (auto p : M)
            smallest = min(smallest, p.second);
        count = smallest - left + 1;
        for (; right < A.size(); ++right)
        {
            M[A[right]] = right;
            while (M.size() > K)
            {
                if (M[A[left]] == left)
                    M.erase(A[left]);
                ++left;
            }
            // O(K) scan for the smallest stored index on every step -- this
            // is the part a better data structure could speed up
            smallest = numeric_limits<int>::max();
            for (auto p : M)
                smallest = min(smallest, p.second);
            count += smallest - left + 1;
        }
        return count;
    }
};
Link to the question: https://leetcode.com/problems/subarrays-with-k-different-integers/
O(n) is not slow; in fact, it is the theoretically fastest possible way to find the minimum, as it's obviously not possible to find the minimum of n items without considering each of them at least once.
You could update the minimum during the loop. That is trivial if the loop only adds new items to the map, but it becomes much harder if the loop may change existing items (and may increase the value of the until-then minimum item!). Ultimately, though, this also adds an O(n) amount of work, or more, so complexity-wise it's no different from doing an extra loop at the end. (The constant can differ - the extra loop may be slower than reusing the original loop - but the complexity is the same.)
As you said, there are data structures that make it more efficient (O(log n) or even O(1)) to retrieve the minimum item, but at the cost of extra work to maintain the data structure during insertion. These data structures only make sense if you frequently need the minimum while inserting or changing items - not if you need the minimum only once, at the end of the loop, as you described.
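For the insert-only case, a minimal sketch of the running-minimum idea (the function name and data layout are made up for illustration):
#include <algorithm>
#include <climits>
#include <unordered_map>
#include <utility>
#include <vector>

// Keep the minimum up to date while inserting, so no final O(n) scan is
// needed. This only works if entries are never overwritten with larger
// values, as discussed above.
int lowestMappedValue(std::vector<std::pair<int, int>> const& updates)
{
    std::unordered_map<int, int> m;
    int lowest = INT_MAX;
    for (auto const& kv : updates)
    {
        m[kv.first] = kv.second;
        lowest = std::min(lowest, kv.second);
    }
    return lowest;
}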
I made a simple class to make this work. Although it's far from perfect, it's good enough for the question linked above:
// Maps keys to values while mirroring all current values in an ordered set,
// so the minimum value can be read in O(1); updates cost O(log n).
class BiMap
{
public:
    void insert(int key, int value)
    {
        auto itr = M.find(key);
        if (itr == M.cend())
            M.emplace(key, S.insert(value).first);
        else
        {
            // Key already present: drop its old value from the set first.
            S.erase(itr->second);
            M[key] = S.insert(value).first;
        }
    }
    void erase(int key)
    {
        auto itr = M.find(key);
        S.erase(itr->second);
        M.erase(itr);
    }
    int operator[] (int key)
    {
        return *M.find(key)->second;
    }
    int size()
    {
        return M.size();
    }
    int minimum()
    {
        return *S.cbegin();
    }
private:
    unordered_map<int, set<int>::const_iterator> M; // key -> iterator into S
    set<int> S;                                     // all mapped values, ordered
};
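A quick hypothetical usage, assuming the same using namespace std context as the class above:
#include <cstdio>

int main()
{
    BiMap m;
    m.insert(1, 10);
    m.insert(2, 5);
    printf("%d\n", m.minimum()); // 5
    m.insert(2, 7);              // re-inserting a key replaces its old value
    printf("%d\n", m.minimum()); // 7: the stale 5 is gone from the set
    m.erase(2);
    printf("%d\n", m.minimum()); // 10
    return 0;
}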
class Solution {
public:
    int subarraysWithKDistinct(vector<int>& A, int K) {
        int left, right;
        BiMap M;
        for (left = right = 0; right < A.size() && M.size() < K; ++right)
            M.insert(A[right], right);
        if (right == A.size())
            return 0;
        int count = M.minimum() - left + 1;
        for (; right < A.size(); ++right)
        {
            M.insert(A[right], right);
            while (M.size() > K)
            {
                if (M[A[left]] == left)
                    M.erase(A[left]);
                ++left;
            }
            count += M.minimum() - left + 1;
        }
        return count;
    }
};

Uninitialized Local Variable 'Quick' Used

I'm making this function which counts the total number of swaps and comparisons a quick sort does. When I run it, however, I get this error:
error C4700: uninitialized local variable 'quick' used
This happens at the 'if' statement for the base case in the function code below. SwapandComp is the name of the struct I am using to keep track of both the swaps and comparisons for the sorting, and partition is the function where we find where to separate the original array; it is also where we count the swaps and comparisons.
// totalSwaps and totalComps are presumably globals defined elsewhere
int partition(int numbers[], int i, int k) {
    int l = 0;
    int h = 0;
    int midpoint = 0;
    int pivot = 0;
    int temp = 0;
    bool done = false;
    // Pick middle element as pivot
    midpoint = i + (k - i) / 2;
    pivot = numbers[midpoint];
    l = i;
    h = k;
    while (!done) {
        // Increment l while numbers[l] < pivot
        while (numbers[l] < pivot) {
            ++l;
            totalComps++;
        }
        // Decrement h while pivot < numbers[h]
        while (pivot < numbers[h]) {
            --h;
            totalComps++;
        }
        // If there are zero or one elements remaining,
        // all numbers are partitioned. Return h
        if (l >= h) {
            totalComps++;
            done = true;
        }
        else {
            // Swap numbers[l] and numbers[h],
            // update l and h
            temp = numbers[l];
            numbers[l] = numbers[h];
            numbers[h] = temp;
            totalSwaps++;
            ++l;
            --h;
        }
    }
    return h;
}
And now here is the quick sort function. As mentioned before, SwapandComp is the struct I used to keep track of both swaps and comparisons.
SwapandComp quicksort(int numbers[], int i, int k) {
    SwapandComp quick;
    int j = 0;
    int z = 0;
    // Base case: if there are 1 or zero elements to sort,
    // the partition is already sorted
    if (i >= k) {
        return quick; // quick is still uninitialized here
    }
    // Partition the data within the array. Value j returned
    // from partitioning is the location of the last element in the low partition.
    j = partition(numbers, i, k);
    // Recursively sort the low partition (i to j) and
    // the high partition (j + 1 to k)
    quicksort(numbers, i, j);
    quicksort(numbers, j + 1, k);
    quick.swaps = totalSwaps;
    quick.comps = totalComps;
    return quick;
}
On the second line, I write
SwapandComp quick;
to use for the quick sort struct. The error doesn't really make sense to me, because I did declare 'quick' as a new struct for the function to return. Any help is appreciated! Thanks!
Initialize the struct as below:
SwapandComp quick = { 0 };
SwapandComp quick;
Unless that type has a constructor, declaring a variable of it inside a function leaves it in an indeterminate state. Returning it without first initialising it (as in your base case) causes exactly the issue you're seeing: an "uninitialized local variable used" warning.
You could just initialise the members when declaring it, such as with:
SwapandComp quick; quick.swaps = quick.comps = 0;
But a better way to do it is with real initialisers, something like:
struct SwapAndComp {
    unsigned swaps;
    unsigned comps;
    SwapAndComp() : swaps(0U), comps(0U) {}
};
This method (initialisation as part of the class itself) allows you to properly create the structure without any users of it needing to worry about doing it correctly. And, if you want flexibility, you can simply provide a constructor that allows it while still defaulting to the "set to zero" case:
SwapAndComp(unsigned initSwaps = 0U, unsigned initComps = 0U)
    : swaps(initSwaps), comps(initComps) {}
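For what it's worth, since C++11 you can get the same effect with default member initializers, without writing any constructor (a minimal sketch):
struct SwapAndComp {
    unsigned swaps = 0U; // default member initializers: every SwapAndComp
    unsigned comps = 0U; // starts zeroed unless a constructor says otherwise
};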

C++: Get K smallest elements+indices from vector with ties

The task is to extract the k smallest elements and their indices from an array of doubles, possibly including extra elements that are tied with the k-th smallest one. E.g.:
input: {3.3,1.1,6.5,4.2,1.1,3.3}
output (k=3): {1,1.1} {4,1.1} {0,3.3} {5,3.3}
[This seems like a pretty common task, but I couldn't find a similar thread on SO that handles ties. Hopefully I didn't miss one and didn't duplicate the question.]
I came up with the following solution, which works and seems to be fairly efficient complexity-wise. E.g. for 1 million random doubles and k=10 it takes ~40 ms with MSVC 2013. I wonder if there's a better/cleaner/more efficient (for large data and/or large k) way to perform this task (validation of the k value and similar things are out of scope here). Avoid allocating the queue with all elements? Make use of std::partial_sort or std::nth_element?
typedef std::pair<double, int> idx_pair;
typedef std::priority_queue<idx_pair, std::vector<idx_pair>, std::greater<idx_pair>> idx_queue;

std::vector<idx_pair> getKSmallest(std::vector<double> const& data, int k)
{
    idx_queue q;
    {
        std::vector<idx_pair> idxPairs(data.size());
        for (auto i = 0; i < data.size(); i++)
            idxPairs[i] = idx_pair(data[i], i);
        q = idx_queue(std::begin(idxPairs), std::end(idxPairs));
    }
    std::vector<idx_pair> result;
    auto topPop = [&q, &result]()
    {
        result.push_back(q.top());
        q.pop();
    };
    // Pop the k smallest...
    for (auto i = 0; i < k; i++)
        topPop();
    // ...then keep popping while the value ties with the k-th smallest.
    auto const largest = result.back().first;
    while (q.empty() == false)
    {
        if (q.top().first == largest)
            topPop();
        else
            break;
    }
    return result;
}
Working example is here.
Here's an alternative solution, suggested by @piotrekg2 - using nth_element with average O(N) complexity:
// Caution: an absolute-epsilon comparison like this is only meaningful for
// values with magnitude around 1; a relative tolerance is safer in general.
bool equal(double value1, double value2)
{
    return value1 == value2 || std::abs(value2 - value1) <= std::numeric_limits<double>::epsilon();
}

std::vector<idx_pair> getNSmallest(std::vector<double> const& data, int n)
{
    std::vector<idx_pair> idxPairs(data.size());
    for (auto i = 0; i < data.size(); i++)
        idxPairs[i] = idx_pair(data[i], i);
    // Partition around position n - 1, so that idxPairs[n - 1] is the n-th
    // smallest element and everything before it is <= it. (Partitioning
    // around position n would leave idxPairs[n - 1] unspecified.)
    std::nth_element(std::begin(idxPairs), std::begin(idxPairs) + n - 1, std::end(idxPairs));
    std::vector<idx_pair> result(std::begin(idxPairs), std::begin(idxPairs) + n);
    auto const largest = result.back().first;
    for (auto it = std::begin(idxPairs) + n; it != std::end(idxPairs); ++it)
        if (equal(it->first, largest))
            result.push_back(*it);
    return result;
}
Indeed, the code looks a bit cleaner. However, I've run some tests, and empirically this solution is slightly slower than the original one with std::priority_queue.
Note: The answer below by Petar offers a similar solution using std::nth_element which, in my experiments, performs slightly better than this one, and also better than the std::priority_queue solution - perhaps because it avoids operating on pairs and works with primitive doubles instead.
As suggested by the asker, first copy the vector of doubles and use nth_element to find the k-th element. Then do a linear scan and collect the elements that are smaller than or equal to the k-th element. The time complexity is linear.
However, care is needed when comparing doubles.
vector<idx_pair> getKSmallest(vector<double> const& data, int k){
    vector<double> data_copy = data;
    // Put the k-th smallest element at index k - 1; everything before it is <= it.
    nth_element(data_copy.begin(), data_copy.begin() + k - 1, data_copy.end());
    double kth_element = data_copy[k - 1];
    vector<idx_pair> result;
    for (int i = 0; i < data.size(); i++)
        if (data[i] <= kth_element)
            result.push_back({data[i], i}); // idx_pair is {value, index}
    return result;
}
update: It is also possible to find the kth_element by maintaining a max-heap of size at most k.
This needs only O(k) memory for the heap instead of the O(n) memory of the nth_element method.
It takes O(n log k) time, but if k is small I think it should be comparable to the O(n) method.
I am not sure about this, but my reasoning is that the heap may stay in cache, and you don't spend time copying the data.
vector<idx_pair> getKSmallest(vector<double> const& data, int k)
{
    priority_queue<double> pq; // max-heap holding the k smallest seen so far
    for (auto d : data){
        if (pq.size() >= k && pq.top() > d){
            pq.push(d);
            pq.pop();
        }
        else if (pq.size() < k)
            pq.push(d);
    }
    double kth_element = pq.top();
    vector<idx_pair> result;
    for (int i = 0; i < data.size(); i++)
        if (data[i] <= kth_element)
            result.push_back({data[i], i});
    return result;
}

Counting the Number of Element Comparisons in Quick Sort

I have been provided this predefined code for quicksort, which isn't to be altered much.
I know we already have questions on this, but this one is different, as the logic here is predefined:
void quicksort(int a[], int l, int r)
{
    if (r <= l) return;
    /* call the partition function that you modify */
    int i = partition(a, l, r);
    quicksort(a, l, i-1);
    quicksort(a, i+1, r);
}

int partition(int a[], int l, int r)
{
    int i = l-1, j = r;
    int v = a[r]; // rightmost element is the pivot
    for (;;)
    {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        exch(a[i], a[j]);
    }
    exch(a[i], a[r]);
    return i;
}
We are just required to make a slight modification so that quicksort returns the number of comparisons that it and the partition function together have performed in sorting the given array a. Only comparisons that involve array elements are counted, and no global variables may be used for the counting.
I have implemented this as follows, kindly let me know if I'm mistaken somewhere:
int partition(int a[], int l, int r, int& count) {
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) count++;
        while (v < a[--j]) {
            count++;
            if (j == l) break;
        }
        if (i >= j) break;
        swap(a[i], a[j]);
    }
    swap(a[i], a[r]);
    return i;
}

int quickSort(int a[], int l, int r) {
    int count = 0;
    if (r <= l) return 0;
    int i = partition(a, l, r, count);
    return count + quickSort(a, l, i - 1) + quickSort(a, i + 1, r);
}
Once you confirm, I'm going to share a surprising result of my research on this with you.
rici's comment works fine as a solution. An alternative approach, which generalizes to std::sort and other algorithms, is to make a counting comparer.
Something along the lines of
struct CountingComparer {
    CountingComparer() : count(0) {}
    CountingComparer(const CountingComparer& cc) : count(cc.count) {}
    bool operator()(int lhs, int rhs) {
        ++count; // count every comparison made through this functor
        return lhs < rhs;
    }
    size_t count;
};
Now you need to change the signature of your function to add the comparer as a last argument, like:
template<typename COMP>
void quicksort( .... , COMP comp){
And make the same change to the partition function. The comparisons are then made by
while (comp(a[++i], v)) and
while (comp(v, a[--j])) respectively.
Calling the sort
You need to make sure the comparer is passed by reference, via std::ref, so the algorithm does not count into a private copy:
CountingComparer comp;
quicksort(... , std::ref(comp));
This makes sure comp is not copied.
After the sort you find the number of comparisons in comp.count.
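The same comparer works unchanged with standard algorithms. A small sketch (the vector contents are made up for illustration):
#include <algorithm>
#include <cstdio>
#include <functional>
#include <vector>

int main()
{
    std::vector<int> v = {5, 3, 1, 4, 2};
    CountingComparer comp;                         // as defined above
    std::sort(v.begin(), v.end(), std::ref(comp)); // std::ref: one shared counter
    std::printf("comparisons: %zu\n", comp.count);
    return 0;
}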
regarding your comment on counts
Your quicksort's behaviour is extensively discussed on the Wikipedia page. It is expected that an already-sorted array behaves badly (the rightmost element is always the pivot), while random elements behave well.
Regarding the cleverness in the partition function
for (;;)
{
    while (a[++i] < v) ;
    while (v < a[--j]) if (j == l) break;
    if (i >= j) break;
    exch(a[i], a[j]);
}
exch(a[i], a[r]);
The for (;;) statement has no loop condition at all, so it's just while (true) in disguise; it ends via a break statement.
Find the first large element to swap: the while (a[++i] < v); statement takes advantage of the fact that the pivot v (that is, a[r]) is the rightmost element, so the pivot acts as a guard here.
Find the first small element to swap: the while (v < a[--j]) if (j == l) break; does not have the guarantee of the pivot; instead, it checks for the leftmost limit.
The final check just tests whether the partition is done. If so, it breaks out of the infinite loop, and finally
exch(a[i], a[r]); moves the pivot element to its correct position.

Blueberries (SPOJ) - Dynamic Programming Time Limit Exceeded

I was solving a problem on SPOJ. The problem has a simple recursive solution.
Problem: Given an array of numbers of size n, select a set of numbers such that no two elements in the set are consecutive, and the sum of the subset elements is as close as possible to k without exceeding it.
My Recursive Approach
I used an approach similar to knapsack, dividing the problem into two branches: one includes the current element and the other ignores it.
function recurse(n, current, k)
    if n < 0
        return current
    if n == 0
        if current + a[n] <= k
            return current + a[n]
        else
            return current
    if current + a[n] > k
        return recurse(n-1, current, k)
    else
        return max(recurse(n-1, current, k), recurse(n-2, current+a[n], k))
Later, as the plain recursion is exponential in nature, I used a map (in C++) for memoization to reduce the complexity.
My source code:
struct k{
    int n;
    int curr;
};

bool operator < (const struct k& lhs, const struct k& rhs){
    if(lhs.n != rhs.n)
        return lhs.n < rhs.n;
    return lhs.curr < rhs.curr;
}

int a[1001];
map<struct k,int> dp;

int recurse(int n, int k, int curr){
    if(n < 0)
        return curr;
    struct k key = {n, curr};
    if(n == 0)
        return curr + a[0] <= k ? curr + a[0] : curr;
    else if(dp.count(key))
        return dp[key];
    else if(curr + a[n] > k){
        dp[key] = recurse(n-1, k, curr);
        return dp[key];
    }
    else{
        dp[key] = max(recurse(n-1, k, curr), recurse(n-2, k, curr+a[n]));
        return dp[key];
    }
}

int main(){
    int t,n,k;
    scanint(t);
    for(int j = 1; j <= t; ++j){ // j numbers the scenarios for the output
        scanint(n);
        scanint(k);
        for(int i = 0; i<n; ++i)
            scanint(a[i]);
        dp.clear();
        printf("Scenario #%d: %d\n", j, recurse(n-1, k, 0));
    }
    return 0;
}
I checked the given test cases and it cleared them, but I am getting a wrong answer on submission.
EDIT: Earlier my output format was wrong, so I was getting Wrong Answer. Now it's showing Time Limit Exceeded. I think a bottom-up approach would be helpful, but I am having trouble formulating one. I am approaching it as bottom-up knapsack, but having some difficulty with the exact formulation.
To my understanding, you almost have the solution. If the recurrence relation is correct but too inefficient, you just have to change the recursion to iteration. Apparently, you already have the map dp which represents the states and their respective values. Basically, you should be able to fill dp with nested loops over n and curr, increasing them in an order that ensures each value of dp that is needed has already been calculated. Then you replace the recursive calls to recurse with lookups into dp, as in the sketch below.
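For what it's worth, here is a minimal bottom-up sketch of that idea using a bitset of reachable sums instead of a map; it assumes k <= 1000 (adjust MAXK to the problem's actual bounds), and bestSum is a hypothetical name:
#include <bitset>
#include <cstddef>
#include <vector>

const int MAXK = 1000; // assumed upper bound on k

// Largest sum <= k achievable with no two consecutive elements selected.
int bestSum(const std::vector<int>& a, int k)
{
    std::bitset<MAXK + 1> prev2, prev1; // sums reachable from a[0..i-2] / a[0..i-1]
    prev2[0] = prev1[0] = true;         // the empty selection sums to 0
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        std::bitset<MAXK + 1> cur = prev1; // option 1: skip a[i]
        if (a[i] <= k)
            cur |= (prev2 << a[i]);        // option 2: take a[i], skipping a[i-1]
        prev2 = prev1;
        prev1 = cur;
    }
    for (int s = k; s >= 0; --s)           // highest reachable sum not above k
        if (prev1[s])
            return s;
    return 0;
}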