Subtract vectors that contain duplicated elements - C++

Is there any elegant way to subtract std::vectors that contain duplicated elements?
Example:
v1 = { 3, 1, 2, 1, 2, 2 }
v2 = { 2, 4, 3, 3, 3 }
result1 = ??( v1, v2 )
result2 = ??( v2, v1 )
and I want the result to be:
result1 = { 1, 1 }
result2 = { 4 }
My current (and very slow) solution:
1) sort v1 and v2
2) use std::unique_copy to build v1_uniq and v2_uniq
3) intersect the new vectors with std::set_intersection
4) iterate over v1 and v2 and remove all elements that are in the intersection from 3)
My other idea is:
1) sort v1 and v2
2) iterate over v1 and v2 and remove duplicates in parallel
But this is kind of error-prone and doesn't look elegant to me.
Any other ideas?

You could use std::copy_if with a unary predicate that checks whether the element is not in the second vector. Or, if you don't have C++11 support, use std::remove_copy_if with the predicate's logic inverted, i.e. returning true when the element is in the second vector (which is what the functor below does).
For the unary predicate:
struct Foo {
    Foo(const std::vector<int>& v) : v_(v) {}
    bool operator() (int i) const {
        // return true if i is in v_
        return std::find(v_.begin(), v_.end(), i) != v_.end();
    }
    const std::vector<int>& v_;
};
which can be instantiated like this:
Foo f(v2);
You could modify the functor to keep a sorted copy of the reference vector with unique entries, so that the lookup can be a binary search, but the general idea is the same.
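As a minimal sketch of the whole subtraction (assuming the Foo functor above; the helper name subtract is made up for illustration):

#include <algorithm>
#include <iterator>
#include <vector>

// copies every element of "from" whose value does not occur anywhere in "values"
std::vector<int> subtract(const std::vector<int>& from, const std::vector<int>& values) {
    std::vector<int> result;
    std::remove_copy_if(from.begin(), from.end(), std::back_inserter(result),
                        Foo(values));  // Foo returns true when the element occurs in values
    return result;
}

// subtract(v1, v2) yields {1, 1} and subtract(v2, v1) yields {4} for the vectors above.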

I have a rather simple algorithm whose complexity is O(n²). However, it can be made faster with a sort (O(n log n)). Here it is:
subtract s from v:
    for each element of v
        for each element of s
            if the i-th element of v == the j-th element of s
                then remove it from v and break the loop over s
With other structures, maybe it could be faster. For example, if elements were shared, you could detach all elements of v that are shared with s in O(n).
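A rough sketch of the O(n log n) variant (sorting a copy of s so the inner lookup becomes a binary search; the helper name remove_all_in is made up):

#include <algorithm>
#include <vector>

// erases from v every element whose value occurs in s
void remove_all_in(std::vector<int>& v, std::vector<int> s) {
    std::sort(s.begin(), s.end());                      // O(n log n)
    v.erase(std::remove_if(v.begin(), v.end(),
                [&](int x) { return std::binary_search(s.begin(), s.end(), x); }),
            v.end());                                   // O(m log n) lookups
}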

Is it possible to make a vector of ranges in C++20?

Let's say I have a vector<vector<int>>. I want to use ranges::transform in such a way that I get:
vector<vector<int>> original_vectors;
using T = decltype(ranges::views::transform(original_vectors[0], [&](int x){
    return x;
}));
vector<int> transformation_coeff;
vector<T> transformed_vectors;
for(int i = 0; i < n; i++){
    transformed_vectors.push_back(ranges::views::transform(original_vectors[i], [&](int x){
        return x * transformation_coeff[i];
    }));
}
Is such a transformation, or something similar, currently possible in C++?
I know it's possible to simply store the transformation_coeff, but it's inconvenient to apply it at every step. (This will be repeated multiple times, so it needs to be done in O(log n); therefore I can't explicitly apply the transformation.)
Yes, you can have a vector of ranges. The problem in your code is that you are using a temporary lambda in your using statement. Because of that, the type of the item you are pushing into the vector later is different from T. You can solve it by assigning the lambda to a variable first:
vector<vector<int>> original_vectors;
auto lambda = [&](int x){return x;};
using T = decltype(ranges::views::transform(original_vectors[0], lambda));
vector<T> transformed_vectors;
transformed_vectors.push_back(ranges::views::transform(original_vectors[0], lambda));
It is not possible in general to store different ranges in a homogeneous collection like std::vector, because different ranges usually have different types, especially if transforms using lambdas are involved. No two lambdas have the same type, and the type of the lambda will be part of the range type. If the signatures of the functions you want to pass to the transform are the same, you could wrap the lambdas in std::function as suggested by @IlCapitano (https://godbolt.org/z/zGETzG4xW). Note that this comes at the cost of the additional overhead std::function entails.
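A minimal sketch of that std::function workaround (assuming standard C++20 <ranges>; the variable names and values are placeholders):

#include <functional>
#include <iostream>
#include <ranges>
#include <vector>

int main() {
    std::vector<std::vector<int>> original_vectors = {{1, 2}, {3, 4}};
    std::vector<int> transformation_coeff = {2, 3};

    using F = std::function<int(int)>;
    // all views share one type because the callable type is erased behind std::function
    using T = decltype(std::views::transform(original_vectors[0], F{}));

    std::vector<T> transformed_vectors;
    for (std::size_t i = 0; i < original_vectors.size(); ++i) {
        transformed_vectors.push_back(std::views::transform(
            original_vectors[i],
            F{[coeff = transformation_coeff[i]](int x) { return x * coeff; }}));
    }

    for (const auto& r : transformed_vectors) {
        for (int v : r) std::cout << v << ' ';   // prints: 2 4  and  9 12
        std::cout << '\n';
    }
}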
A better option might be to create a range of ranges.
If I understand you correctly, you have a vector of n vectors, e.g.
std::vector<std::vector<int>> original_vector = {
{1, 5, 10},
{2, 4, 8},
{5, 10, 15}
};
and a vector of n coefficients, e.g.
std::vector<int> transformation_coeff = {2, 1, 3};
and you want a range of ranges representing the transformed vectors, where the ith range represents the ith vector's elements which have been multiplied by the ith coefficient:
{
{ 2, 10, 20}, // {1, 5, 10} * 2
{ 2, 4, 8}, // {2, 4, 8} * 1
{15, 30, 45} // {5, 10, 15} * 3
}
Did I understand you correctly? If yes, I don't understand what you mean by your complexity requirement of O(log n). What does n refer to in this scenario? How would this calculation be possible in fewer than n steps? Here is a solution that gives you the range of ranges you want. Evaluating this range requires O(n*m) multiplications, where m is an upper bound for the number of elements in each inner vector. I don't think it can be done in fewer steps, because you have to multiply each element of original_vector once. Of course, you can always evaluate just part of the range, because the evaluation is lazy.
C++20
The strategy is to first create a range for the transformed i-th vector given the index i. Then you can create a range of ints using std::views::iota and transform it to the inner ranges:
auto transformed_ranges = std::views::iota(0, (int) original_vector.size()) | std::views::transform(
    [=](int i){
        // get a range containing only the ith inner range
        auto ith = original_vector | std::views::drop(i) | std::views::take(1) | std::views::join;
        // transform the ith inner range
        return ith | std::views::transform(
            [=](auto const& x){
                return x * transformation_coeff[i];
            }
        );
    }
);
You can now do
for (auto const& transformed_range : transformed_ranges){
    for (auto const& val : transformed_range){
        std::cout << val << " ";
    }
    std::cout << "\n";
}
Output:
2 10 20
2 4 8
15 30 45
Full Code on Godbolt Compiler Explorer
C++23
This is the perfect job for C++23's std::views::zip_transform:
auto transformed_ranges = std::views::zip_transform(
    [=](auto const& ith, auto const& coeff){
        return ith | std::views::transform(
            [=](auto const& x){
                return x * coeff;
            }
        );
    },
    original_vector,
    transformation_coeff
);
It's a bit shorter and has the added benefit that transformation_coeff is treated as a range as well:
It is more general, because we are not restricted to std::vectors
In the C++20 solution you get undefined behaviour without additional size checking if transformation_coeff.size() < original_vector.size() because we are indexing into the vector, while the C++23 solution would just return a range with fewer elements.
Full Code on Godbolt Compiler Explorer

Can you access the current iterator from a function used by the transform function in C++?

Can you access the current iterator from a function used by the transform function in C++, so that you can reference the previous and next values?
I want to use the transform function to iterate through a vector, performing operations on the vector that rely on values before and after the current value.
For example, say I have a vector with values [1,2,4,3,5,6], and I want to start at the second value, and iterate until the second to last value. On each of those elements, I want to make a new value that equals the sum of the value, and the values next to it in the original.
The ending vector would look like
[7,9,12,14].
auto originalsBeginIterator = originalPoints.begin();
auto originalsEndIterator = originalPoints.end();
std::advance(originalsBeginIterator, 1);
std::advance(originalsEndIterator, -1);
std::transform(originalsBeginIterator, originalsEndIterator, alteredValues.begin(),
    [](int x) { return /* previous value */ + x + /* next value */; }
);
Is there any way to reference the previous and next values from the original array when using transform?
Clearly the tool std::transform simply doesn't give you a way to do that: it either takes a unary operation to be applied to individual elements of one collection, or a binary operation to be applied to corresponding elements of two collections.
But the point is that, from the functional programming perspective, what you are trying to do is simply not a transform.
How can you go about it instead? You could zip that vector, let's call it v, with the same vector without its first element, and the same vector without its first two elements; you would then sum the 3 elements of each tuple.
Range-v3 gives you a way to do this quite tersely:
#include <iostream>
#include <range/v3/view/drop.hpp>
#include <range/v3/view/transform.hpp>
#include <range/v3/view/zip_with.hpp>
#include <vector>

using namespace ranges::views;

int main()
{
    // input
    std::vector<int> v{1,2,3,4,5,6};
    // to sum 3 ints
    auto constexpr plus = [](int x, int y, int z){ return x + y + z; };
    // one-liner
    auto w = zip_with(plus, v, v | drop(1), v | drop(2));
    // output
    std::cout << w << std::endl;
}
v | drop(1) gives you a view on the elements {2,3,4,5,6}, and v | drop(2) on {3,4,5,6}; zip_with takes an n-ary function and n ranges and combines the n-tuple of corresponding elements from the n ranges using the n-ary function. So in our case it'll go like this:
v                         = {1, 2, 3, 4, 5, 6}
                             +  +  +  +
v1 = v | drop(1)          = {2, 3, 4, 5, 6}
                             +  +  +  +
v2 = v | drop(2)          = {3, 4, 5, 6}
zip_with(plus, v, v1, v2) = {6, 9, 12, 15}
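If pulling in Range-v3 is not an option, a plain index loop gives the same result; a sketch (with the C++23 std::views::adjacent_transform alternative shown only as a comment, since it needs a recent compiler):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v{1, 2, 3, 4, 5, 6};
    std::vector<int> w;
    // sum each element with its left and right neighbours
    for (std::size_t i = 1; i + 1 < v.size(); ++i) {
        w.push_back(v[i - 1] + v[i] + v[i + 1]);
    }
    for (int x : w) std::cout << x << ' ';   // prints: 6 9 12 15
}

// C++23 alternative (needs <ranges>), producing a lazy view instead of a vector:
// auto w = v | std::views::adjacent_transform<3>(
//     [](int a, int b, int c) { return a + b + c; });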

How to find std::max_element on std::vector<std::pair<int, int>> in either of the axes?

How do I find the max element in this std::vector<std::pair<int, int>> across either of the two columns?
Let this be the sample vector of pairs:
0, 1
0, 2
1, 1
1, 2
1, 4
2, 2
3, 1
I tried using std::minmax_element():
const auto p = std::minmax_element(edges.begin(), edges.end());
auto max = p.second->first;
But this gives the max element only of the first column, i.e. 3, whereas I want the max element across both columns, i.e. 4.
I want the max element to be the highest element of either column.
Use std::max_element with a custom compare function, something like:
auto max_pair = *std::max_element(std::begin(edges), std::end(edges),
    [](const auto& p1, const auto& p2) {
        return std::max(p1.first, p1.second) < std::max(p2.first, p2.second);
    });
int max = std::max(max_pair.first, max_pair.second);
You need to provide a predicate which defines the "less than" relation for your items:
const auto p = std::minmax_element(
    edges.begin(), edges.end(),
    [](const auto& a, const auto& b) {
        // provide the "less than" relation you need, for example:
        return std::max(a.first, a.second) < std::max(b.first, b.second);
    });
By default (as in your code) operator< is used. For std::pair it compares lexicographically: if the first elements differ, their ordering decides the result; if they are equal, the second elements are compared.
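If you only need the maximum value itself rather than the pair it comes from, a plain scan over the question's edges vector is a simple alternative; a small sketch (the function name is made up):

#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

int max_of_either_column(const std::vector<std::pair<int, int>>& edges) {
    int best = std::numeric_limits<int>::min();
    for (const auto& [a, b] : edges) {
        best = std::max({best, a, b});  // consider both members of every pair
    }
    return best;
}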

Fast union building of multiple vectors in C++

I’m searching for a fast way to build a union of multiple vectors in C++.
More specifically: I have a collection of vectors (usually 15-20 vectors with several thousand unsigned integers; always sorted and unique, so they could also be std::sets). For each stage I choose some of them (usually 5-10) and build a union vector. Then I save the length of the union vector and choose some other vectors. This is done several thousand times. In the end I'm only interested in the length of the shortest union vector.
Small example:
V1: {0, 4, 19, 40}
V2: {2, 4, 8, 9, 19}
V3: {0, 1, 2, 4, 40}
V4: {9, 10}
// The Input Vectors V1, V2 … are always sorted and unique (could also be an std::set)
Choose V1 , V3;
Union Vector = {0, 1, 2, 4, 19, 40} -> Size = 6;
Choose V1, V4;
Union Vector = {0, 4, 9, 10, 19, 40} -> Size = 6;
… and so on …
At the moment I use std::set_union but I’m sure there must be a faster way.
vector<vector<uint64_t>> collection;
vector<uint64_t> chosen;
vector<uint64_t> unionVector, unionVectorTmp;
for(unsigned int i = 0; i < chosen.size(); i++) {
    set_union(collection.at(chosen.at(i)).begin(),
              collection.at(chosen.at(i)).end(),
              unionVector.begin(),
              unionVector.end(),
              back_inserter(unionVectorTmp));
    unionVector.swap(unionVectorTmp);
    unionVectorTmp.clear();
}
I'm grateful for any pointers.
EDIT 27.04.2017
A new Idea:
unordered_set<unsigned int> unionSet;
unsigned int counter = 0;
for(const auto &sel : selection){
    for(const auto &val : sel){
        auto r = unionSet.insert(val);
        if(r.second){
            counter++;
        }
    }
}
If they're sorted you can roll your own that's O(N+M) in runtime. Otherwise you can use a hash table with a similar runtime.
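A minimal sketch of that merge-style "roll your own" count for two sorted, duplicate-free vectors (the function name union_size is made up):

#include <cstddef>
#include <cstdint>
#include <vector>

// counts the number of distinct values in the union of two sorted, unique vectors
std::size_t union_size(const std::vector<uint64_t>& a, const std::vector<uint64_t>& b) {
    std::size_t i = 0, j = 0, count = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j])      ++i;
        else if (b[j] < a[i]) ++j;
        else { ++i; ++j; }    // same value in both vectors: count it once
        ++count;
    }
    // whatever is left over in either vector is unique to it
    return count + (a.size() - i) + (b.size() - j);
}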
The de facto way in C++98 is std::set_intersection, but with C++11 (or TR1) you can go for std::unordered_set instead; since the input vectors contain unique values, this gives you a nice O(N) algorithm:
Construct an unordered_set out of your first vector
Check if the elements of your 2nd vector are in the set
Something like that will do:
std::unordered_set<int> us(std::begin(v1), std::end(v1));
auto res = std::count_if(std::begin(v2), std::end(v2),
                         [&](int n) { return us.find(n) != std::end(us); });
There's no need to create the entire union vector. You can count the number of unique elements among the selected vectors by keeping a list of iterators and comparing/incrementing them appropriately.
Here's the code:
int countUnique(const std::vector<std::vector<unsigned int>>& selection)
{
    std::vector<std::vector<unsigned int>::const_iterator> iters;
    for (const auto& sel : selection) {
        iters.push_back(sel.begin());
    }
    // true once every iterator has reached the end of its vector
    auto atEnd = [&]() -> bool {
        for (size_t i = 0; i < iters.size(); ++i) {
            if (iters[i] != selection[i].end()) {
                return false;
            }
        }
        return true;
    };
    int count = 0;
    while (!atEnd()) {
        // find the minimum value among the iterators that are not at the end
        unsigned int min = 0;
        bool found = false;
        for (size_t i = 0; i < iters.size(); ++i) {
            if (iters[i] != selection[i].end() && (!found || *iters[i] < min)) {
                min = *iters[i];
                found = true;
            }
        }
        // advance every iterator that currently points at that minimum
        for (size_t i = 0; i < iters.size(); ++i) {
            if (iters[i] != selection[i].end() && *iters[i] == min) {
                ++iters[i];
            }
        }
        ++count;
    }
    return count;
}
This uses the fact that your input vectors are sorted and only contain unique elements.
The idea is to keep an iterator into each selected vector. The minimum value among those iterators is our next unique value in the union vector. Then we increment all iterators whose value is equal to that minimum. We repeat this until all iterators are at the end of the selected vectors.
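For example, with two of the vectors from the question (a hypothetical main, assuming countUnique from above):

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<unsigned int>> selection = {
        {0, 4, 19, 40},   // V1
        {0, 1, 2, 4, 40}  // V3
    };
    std::cout << countUnique(selection) << '\n';   // prints 6
}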

How to compute the intersection of two lists in C++?

I am new to C++ list.
I have two lists: list1 and list2. I need to get the common elements of these lists. How can I do this?
You can use std::set_intersection for that, provided you first sort the two lists.
Example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>

int main() {
    std::list<int> list1{2, 5, 7, 8, -3, 7};
    std::list<int> list2{9, 1, 6, 3, 5, 2, 11, 0};
    list1.sort();
    list2.sort();
    std::list<int> out;
    std::set_intersection(list1.begin(), list1.end(), list2.begin(), list2.end(),
                          std::back_inserter(out));
    for (auto k : out)
        std::cout << k << ' ';
}
Output:
2 5
EDIT:
The above method is likely not going to be optimal, mostly because sorting a std::list isn't nice to the CPU...
For a trade-off of space, the method below will certainly be faster for larger sets of data, because we iterate through each list only once, and none of the operations done at each iteration goes beyond O(1) amortized complexity:
#include <algorithm>
#include <list>
#include <unordered_set>

template<typename T>
std::list<T> intersection_of(const std::list<T>& a, const std::list<T>& b){
    std::list<T> rtn;
    std::unordered_multiset<T> st;
    std::for_each(a.begin(), a.end(), [&st](const T& k){ st.insert(k); });
    std::for_each(b.begin(), b.end(),
        [&st, &rtn](const T& k){
            auto iter = st.find(k);
            if(iter != st.end()){
                rtn.push_back(k);
                st.erase(iter);
            }
        }
    );
    return rtn;
}
I used std::unordered_multiset rather than std::unordered_set because it preserves the occurrences of common duplicates in both lists.
I ran a quick-and-dirty benchmark of the two methods on 9000 randomly generated ints. The results were (lower is better):
Average timings for 100 runs:
intersection_of: 8.16 ms
sortAndIntersect: 18.38 ms
Analysis of using the std::set_intersection method:
Sorting List 1 of size N: O(N log(N))
Sorting List 2 of size M: O(M log(M))
Finding the intersection: O(M + N)
Total: O(N log(N) + M log(M) + M + N), i.e. log-linear overall
Assuming M and N are equal, we can simplify this to O(N log(N))
But if we use the intersection_of method I posted above:
Iterating through List 1 of size N and adding each element to the multiset: N insertions at amortized O(1) each = O(N)
Iterating through List 2 of size M, checking the multiset, appending to the result, and erasing from the multiset: M iterations at amortized O(1) each = O(M)
Total: O(M + N), i.e. linear overall
Assuming M and N are equal, we can simplify this to O(N)
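For completeness, a small usage sketch of intersection_of with the lists from the example above (the result order follows the second list here):

#include <iostream>
#include <list>

int main() {
    std::list<int> list1{2, 5, 7, 8, -3, 7};
    std::list<int> list2{9, 1, 6, 3, 5, 2, 11, 0};
    for (int k : intersection_of(list1, list2))
        std::cout << k << ' ';   // prints: 5 2
}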