Fastest way to check if element is in both vectors - c++

Suppose we have two vectors, vec1 and vec2. What is the fastest way to perform some operation only on the elements that are in both vectors?
This is what I have so far. Can it be made faster, and if so, how?
vector<Test*> vec1;
vector<Test*> vec2;
//Fill both of the vectors, with vec1 containing all existing
//objects of Test, and vec2 containing some of them.
for (Test* test : vec1){
//Check if test is in vec2
if (std::find(vec2.begin(), vec2.end(), test) != vec2.end()){
//Do some stuff
}
}

Your approach is O(M*N), where N is the size of vec1 and M is the size of vec2, because it calls std::find, which is linear in the number of elements of vec2, once for each element of vec1. You can improve upon it in several ways:
Sorting vec2 would let you reduce the time to O((N+M)*log M) - i.e. you can use binary search on the range vec2.begin(), vec2.end()
Sorting both vectors would let you search in O(N log N + M log M) - you could use an algorithm similar to merging sorted ranges to find matching pairs in linear time
Using a hash set for the elements of vec2 would let you reduce the expected time to O(N+M) - now both the construction of the set and the searches in it are linear on average.
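The second option above (sort both, then walk them like a merge) could be sketched as follows. This is only an illustration: the Test struct and the helper name for_each_common are made up here, the vectors are copied so the caller's ordering is untouched, and each vector is assumed to contain no duplicate pointers.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

struct Test { int id; };  // hypothetical stand-in for the asker's Test class

// Calls op(p) for every pointer that occurs in both vectors.
// Both vectors are taken by value, so the sorting happens on copies.
template <typename Op>
void for_each_common(std::vector<Test*> a, std::vector<Test*> b, Op op) {
    // std::less gives a well-defined total order even for unrelated pointers.
    std::less<Test*> less;
    std::sort(a.begin(), a.end(), less);
    std::sort(b.begin(), b.end(), less);

    // Merge-like pass: advance whichever side currently holds the smaller pointer.
    auto i = a.begin();
    auto j = b.begin();
    while (i != a.end() && j != b.end()) {
        if (less(*i, *j)) {
            ++i;
        } else if (less(*j, *i)) {
            ++j;
        } else {
            op(*i);  // present in both vectors
            ++i;
            ++j;
        }
    }
}
```

Whether this beats the hash set depends on the data, so it is worth measuring both.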

One easy way is to use std::unordered_set:
vector<Test*> vec1;
vector<Test*> vec2;
//Fill both of the vectors, with vec1 containing all existing
//objects of Test, and vec2 containing some of them.
std::unordered_set<Test*> set2(vec2.begin(),vec2.end());
for (Test* t : vec1) {
//O(1) lookup in hash set
if (set2.find(t)!=set2.end()) {
//stuff
}
}
This is O(n+m) on average, where n is the number of elements in vec1 and m is the number of elements in vec2.

Related

How to group a vector pair by the second value efficiently?

I am trying to group a vector<pair<int,int>> by the second value of each pair. For example, given v0: (0,1),(1,1),(3,2),(4,2),(5,1), I want two outputs. The first is the set of unique second elements, which is
vector<int> v2={1,2};
The second is groups of the first elements, which could be
vector<vector<int>> v1(2);
v1[0]={0,1,5};
v1[1]={3,4};
How can I achieve this efficiently? Do I need to sort v0 by the second element before grouping? Would std::map be faster? I care about speed as well as the method, because my v0 is a very long, unsorted list of triangle mesh vertex indices. Any suggestions would be appreciated.
Update: I found a solution similar to link. It does not require sorting first, but I have no idea how fast it is.
map<int, vector<int> > vpmap;
for (auto it = v0.begin(); it != v0.end(); ++it) {
vpmap[(*it).second].push_back((*it).first);
}
Here the keys of vpmap correspond to v2, and the mapped vectors correspond to v1.
What you have is a reasonably performant way of getting the exact data structures you're looking for. Be sure you pre-allocate the vectors since you know the size, and use move iterators to avoid unnecessary copying:
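// v2 receives the unique keys, v1 the groups (names as in the question).
std::vector<int> v2;
std::vector<std::vector<int>> v1;
v2.reserve(vpmap.size());
std::transform(vpmap.begin(), vpmap.end(), std::back_inserter(v2), [](const auto& p) { return p.first; });
v1.reserve(vpmap.size());
// Move each group out of the map instead of copying it; vpmap's vectors are left empty afterwards.
std::transform(std::make_move_iterator(vpmap.begin()), std::make_move_iterator(vpmap.end()), std::back_inserter(v1), [](auto&& p) { return std::move(p.second); });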
If you can loosen your constraints, do think about big-picture optimizations like "do I need to transform all this data?"
But once you have something reasonable, stop worrying about the fastest techniques or containers or whatever, and start measuring with a profiler. Sometimes the stuff you worry about winds up being a non-issue, and there are non-obvious costs that stem from your problem domain, your input data, and the accumulation of code.
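For comparison when profiling, the sort-first variant the question asks about might look like the sketch below. The Groups struct and the function name group_by_second are invented for this example; v0 is copied so the original list stays unsorted.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical result type: keys plays the role of v2, members the role of v1.
struct Groups {
    std::vector<int> keys;
    std::vector<std::vector<int>> members;
};

Groups group_by_second(std::vector<std::pair<int, int>> v0) {
    // stable_sort keeps the original relative order of the first values inside each group.
    std::stable_sort(v0.begin(), v0.end(),
                     [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                         return a.second < b.second;
                     });

    Groups g;
    for (const auto& p : v0) {
        // A new key starts a new group; equal keys are adjacent after sorting.
        if (g.keys.empty() || g.keys.back() != p.second) {
            g.keys.push_back(p.second);
            g.members.emplace_back();
        }
        g.members.back().push_back(p.first);
    }
    return g;
}
```

This does one O(n log n) sort and one linear pass. Whether it is faster than the std::map version for a given mesh is something only a measurement can tell.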

C++ library method for intersection of two unordered_set

I have two unordered_set and want the intersection of those. I can't find a library function to do that.
Essentially, what I want is this:
unordered_set<int> a = {1, 2, 3};
unordered_set<int> b = {2, 4, 1};
unordered_set<int> c = a.intersect(b); // Should be {1, 2}
I can do something like
unordered_set<int> c;
for (int element : a) {
if (b.count(element) > 0) {
c.insert(element);
}
}
but I think there should be a more convenient way to do that? If there's not, can someone explain why? I know there is set_intersection, but that seems to operate on vectors only?
Thanks
In fact, a loop-based solution is the best thing you can use with std::unordered_set.
There is an algorithm called std::set_intersection which finds the intersection of two sorted ranges:
Constructs a sorted range beginning at d_first consisting of elements
that are found in both sorted ranges [first1, last1) and [first2,
last2).
As you deal with std::unordered_set, you cannot apply this algorithm because there is no guaranteed order for the elements in std::unordered_set.
My advice is to stick with the loop: it says explicitly what you want to achieve and has linear average complexity (O(N), where N is the number of elements in the unordered set you traverse with the for loop), which is the best complexity you can achieve.
There is a function in std called set_intersection. However, it requires both inputs to be sorted ranges, so it cannot be applied to unordered_set directly. One alternative is to copy the two sets into vectors, sort them, and call set_intersection with the vectors as input.
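If the loop is needed in several places, it can be wrapped in a small helper. The following is a sketch, not a standard library facility; the name intersect is chosen here for illustration. It iterates over the smaller set and probes the larger one, which keeps the number of lookups as low as possible.

```cpp
#include <unordered_set>

// Returns the elements present in both sets (hypothetical helper, not part of std).
template <typename T>
std::unordered_set<T> intersect(const std::unordered_set<T>& a,
                                const std::unordered_set<T>& b) {
    // Walk the smaller set; each lookup in the larger one is O(1) on average.
    const std::unordered_set<T>& smaller = a.size() <= b.size() ? a : b;
    const std::unordered_set<T>& larger = a.size() <= b.size() ? b : a;

    std::unordered_set<T> out;
    for (const T& x : smaller) {
        if (larger.count(x) > 0) {
            out.insert(x);
        }
    }
    return out;
}
```

With the sets from the question, intersect(a, b) yields a set containing 1 and 2.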

C++: Efficient way to check if elements in a vector are greater than elements in another having same indices?

I have a vector < vector <int> > like so:
v = {{1,2,3}, {4,2,1}, {3,1,1}, ...}
All v's elements like v[0], v[1], v[2]... have the same size. There may be duplicate elements.
What I am trying to do is find and delete vectors (like v[2]) that are "majorized" by another vector (like v[1]), i.e. all elements of v[1] are greater than or equal to the respective elements (in order of indices) of v[2].
A naive way of doing this would be to loop through v, compare each vector with every other vector, and within that compare each element with the other vector's element at the same index.
But I feel there must be a better way to do this without getting O(n^3) in the number of elements of all the vectors in v.
If multiple vectors are equal, I need only one of them (i.e delete all except one). A random choice would be sufficient.
Any thoughts or ideas are appreciated!
This is called the maxima of a point set. For two and three dimensions, this can be solved in O(n log n) time. For more than three dimensions, this can be solved in O(n (log n)^(d − 3) log log n) time. For random points, a linear expected time algorithm is available.
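To make the definition concrete, here is a sketch of the naive pairwise baseline. It is not one of the faster algorithms mentioned above. The names Row, majorizes and maxima are chosen for this example, all rows are assumed to have the same length, and the result comes back in sorted order rather than input order.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;

// True if every element of a is >= the element of b at the same index.
bool majorizes(const Row& a, const Row& b) {
    return std::equal(a.begin(), a.end(), b.begin(),
                      [](int x, int y) { return x >= y; });
}

// Keeps only the rows that no other (distinct) row majorizes.
std::vector<Row> maxima(std::vector<Row> v) {
    // Sorting plus std::unique leaves exactly one copy of each duplicated row.
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());

    std::vector<Row> kept;
    for (std::size_t i = 0; i < v.size(); ++i) {
        bool dominated = false;
        for (std::size_t j = 0; j < v.size() && !dominated; ++j) {
            dominated = (j != i) && majorizes(v[j], v[i]);
        }
        if (!dominated) {
            kept.push_back(v[i]);
        }
    }
    return kept;
}
```

This costs O(n^2 * d) comparisons for n rows of length d, so it is mainly useful as a reference to check a faster implementation against.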

how to concatenate Vectors in Eigen?

I have two VectorXd objects in my program and I would like to concatenate them into one vector, so that the second one's values come after the first one's. I found this for matrices, but it doesn't seem to work on vectors:
Eigen how to concatenate matrix along a specific dimension?
Like so, assuming you have vec1 and vec2 already:
VectorXd vec_joined(vec1.size() + vec2.size());
vec_joined << vec1, vec2;
(Note that the vector types are simply typedefs of matrix types constrained to have only one column.)
Further reading: Advanced initialization

Extract element from 2 vectors?

I have 2 vectors: vec1 {e1,e2,e3,e4} and vec2 {e2,e4,e5,e7}.
How can I efficiently get three vectors from them, such that 1. has the elements that are only in vec1, 2. has the elements that are only in vec2, and 3. has the common elements?
std::set_intersection should do the trick, if both vectors are sorted:
http://msdn.microsoft.com/en-us/library/zfd331yx.aspx
std::set_intersection(vec1.begin(), vec1.end(), vec2.begin(), vec2.end(), std::back_inserter(vec3));
A custom comparison function can be supplied too. It must be a less-than style ordering (the same one the vectors are sorted by), not an equality test:
std::set_intersection(vec1.begin(), vec1.end(), vec2.begin(), vec2.end(), std::back_inserter(vec3), my_less_functor());
If they are not sorted, you may of course sort them first, or alternatively, you can iterate through vec1, and for each element, use std::find to see if it exists in vec2.
What you're asking for is that vec3 be the intersection of the other two. Jalf demonstrates how to populate vec3 using the std::set_intersection function from the <algorithm> header. But remember that for the set functions to work, the vectors must be sorted.
Then you want vec1 and vec2 to be the difference between themselves and vec3. In set notation:
vec1 := vec1 \ vec3;
vec2 := vec2 \ vec3;
You can use the std::set_difference function for that, but you can't use it to modify the vectors in-place. You'd have to compute another vector to hold the difference:
std::vector<foo> temp;
std::set_difference(vec1.begin(), vec1.end(),
vec3.begin(), vec3.end(),
std::back_inserter(temp));
vec1 = temp;
temp.clear();
std::set_difference(vec2.begin(), vec2.end(),
vec3.begin(), vec3.end(),
std::back_inserter(temp));
vec2 = temp;
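Putting the pieces together, the whole three-way split could be sketched as one function. The Split struct and the name split_sorted are invented here, int stands in for the element type, and both inputs are assumed to be sorted already. Unlike the snippet above, it takes each difference directly against the other input vector, which gives the same result as subtracting the intersection.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical result type for the three output vectors.
struct Split {
    std::vector<int> only1;   // elements found only in vec1
    std::vector<int> only2;   // elements found only in vec2
    std::vector<int> common;  // elements found in both
};

// Both inputs must already be sorted in ascending order.
Split split_sorted(const std::vector<int>& vec1, const std::vector<int>& vec2) {
    Split s;
    std::set_difference(vec1.begin(), vec1.end(), vec2.begin(), vec2.end(),
                        std::back_inserter(s.only1));
    std::set_difference(vec2.begin(), vec2.end(), vec1.begin(), vec1.end(),
                        std::back_inserter(s.only2));
    std::set_intersection(vec1.begin(), vec1.end(), vec2.begin(), vec2.end(),
                          std::back_inserter(s.common));
    return s;
}
```

Each of the three calls is a single linear pass over the two sorted inputs.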
If the element count is low, you can use the naive approach, which is easy to implement and has O(n^2) running time.
If you have a large number of elements, you can build a hash table from one of them and look up other vector's elements in it. Alternatively, you could sort one of them and binary search through it.
The problem you describe is vector intersection. The best approach depends on the sizes of the input vectors.
If the sizes of both vectors are close to each other, a merge (like in merge sort) is best. If one vector is much smaller than the other, do the following: for each element of the smaller vector, search for that element in the larger vector using binary search.
This is a common problem in information retrieval, where you have to intersect inverted indices. There are some research papers on this.
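The binary-search strategy for inputs of very different sizes might be sketched like this. The function name intersect_skewed is made up, int stands in for the element type, and both vectors are assumed to be sorted and free of duplicates.

```cpp
#include <algorithm>
#include <vector>

// Intersects two sorted, duplicate-free vectors by binary-searching the
// larger one for each element of the smaller one.
std::vector<int> intersect_skewed(const std::vector<int>& a, const std::vector<int>& b) {
    const std::vector<int>& smaller = a.size() <= b.size() ? a : b;
    const std::vector<int>& larger = a.size() <= b.size() ? b : a;

    std::vector<int> out;
    auto from = larger.begin();
    for (int x : smaller) {
        // Because smaller is sorted too, each search can start where the last one ended.
        from = std::lower_bound(from, larger.end(), x);
        if (from == larger.end()) {
            break;  // every remaining x is greater than all of larger
        }
        if (*from == x) {
            out.push_back(x);
        }
    }
    return out;
}
```

With s elements in the smaller vector and l in the larger, this performs at most s binary searches of O(log l) each, which beats a full merge when s is much smaller than l.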